API Documentation
Attribution API
Keep in mind that only the endpoints in this image are exposed for external use.
Their descriptions are labelled External, while the endpoints of the other images are
labelled Internal.
/prediction/status/sha256_hash
Example return
{
"status": {
"FESTIVE Func Sim": "SUCCESS"
},
"detailed_status": {
"FESTIVE Func Sim": {
"PREPROCESSING": "DONE",
"SEARCH": "SUCCESS",
"ERROR_DETAILS": ""
}
}
}
| Key | Typing |
|---|---|
| status | dict[str, str] |
| detailed_status | dict[str, dict[str, str]] |
| Values of 'status' | Explanation |
|---|---|
| SUCCESS | Task completed successfully |
| PENDING | Task is in progress, or the given id cannot be found |
| FAILURE | Task returned an error |
| RETRY | Task is being retried; becomes FAILURE after X retries (set in .env, default 3) |
| REVOKED | An ongoing task that was cancelled |
| None | Returned if a race condition occurs and celery_task_id is still empty |
| Values of 'detailed_status' | Explanation |
|---|---|
| PREPROCESSING | Internal model status; values are not explicitly defined |
| SEARCH | Same as the attribution-api status; one of SUCCESS, PENDING, FAILURE, RETRY or REVOKED |
| ERROR_DETAILS | The error message if one was returned, otherwise an empty string "" |
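As a rough illustration of how a client might interpret the status values above, the sketch below groups them into finished and unfinished states. The function name `summarize_status` and the labels `done`, `running` and `unknown` are hypothetical. Treating PENDING as still running is an assumption, since PENDING is also returned when the id cannot be found.

```python
# Status values taken from the table above.
TERMINAL = {"SUCCESS", "FAILURE", "REVOKED"}
IN_PROGRESS = {"PENDING", "RETRY"}  # assumption: PENDING is treated as "still running"


def summarize_status(response: dict) -> dict:
    """Map each model name in the 'status' dict to 'done', 'running' or 'unknown'."""
    summary = {}
    for model, state in response.get("status", {}).items():
        if state in TERMINAL:
            summary[model] = "done"
        elif state in IN_PROGRESS:
            summary[model] = "running"
        else:
            # Covers None (celery_task_id still empty) and any undocumented value.
            summary[model] = "unknown"
    return summary
```

An empty `status` dict (no previous submission) simply yields an empty summary.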
Other examples
Example return when DB entries are missing
{
...
"detailed_status": {
"FESTIVE Func Sim": {
"PREPROCESSING": null,
...
}
}
}
Example of return with revoked tasks
{
...
"detailed_status": {
"FESTIVE Func Sim": {
...
"SEARCH": "REVOKED",
"ERROR_DETAILS": "Prediction Task is revoked"
}
}
}
Example return when there is no previous submission
{
"status": {},
"detailed_status": {}
}
/prediction/revoke/model_name/sha256_hash
Example return
{
"detail": "Prediction Task [50438139-b7c4-40e1-9c28-4ea06955e3c7] revoked"
}
| Responses |
|---|
| Prediction Task [celery_task_id] revoked |
| Prediction Task already completed or been revoked |
| Prediction Task not found |
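A client may want to tell a successful revoke apart from the other two responses. The sketch below pulls the Celery task id out of the message shown in the example return. The helper name `revoked_task_id` is hypothetical, and it assumes every response arrives under the `detail` key, which the source only shows for the success case.

```python
import re
from typing import Optional

# Matches the success message, e.g. "Prediction Task [<celery_task_id>] revoked".
_REVOKED = re.compile(r"^Prediction Task \[(?P<task_id>[^\]]+)\] revoked$")


def revoked_task_id(response: dict) -> Optional[str]:
    """Return the revoked task id, or None if the task was not revoked by this call."""
    match = _REVOKED.match(response.get("detail", ""))
    return match.group("task_id") if match else None
```

The other two responses contain no bracketed id, so they yield `None`.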
/prediction/model_name/sha256_hash
Example return
{
"model": "FESTIVE Func Sim",
"error_details": "",
"sha256_hash": "b94c9061de2af1e4ac2abe7e2d350fdb7924e3de21cbb84c4f5a0d456dbd45d9",
"explanation_results": null,
"predicted_at": "2025-01-21T02:49:13.454923",
"celery_task_id": "33f86794-c428-4e24-a40f-963c79a67e83",
"prediction_results": {
"1420b4c4ad6c9efdaa9cfdf26816f14bf78f3d81ff81c661c7cf9cd28b13e84e": {
"hash": "1420b4c4ad6c9efdaa9cfdf26816f14bf78f3d81ff81c661c7cf9cd28b13e84e",
"name": "frame_dummy",
"source_code": "long long function_1() {\n return register_tm_clones();\n}\n",
"weaviate_uuid": "70ff7ffb-2af5-4ed6-bf13-fbd010a71141",
"weaviate_data_source": "USER",
"similarity": {
"6771": {
"name": "frame_dummy",
"package": "binutils",
"version": "2.30",
"level": "O0",
"id": "6771",
"source_code": "long long function_1() {\n return register_tm_clones();\n}\n\n",
"score": 1
}, # only highest similarity function has information
"345558": {...},
"183205": {...},
"183208": {...},
"171187": {...}
# only top 5 are returned
},
},
... # more functions from the submitted sample
} # end of prediction_results
}
Other examples
Example of an erroneous return
{
"model": "FESTIVE Func Sim",
"error_details": "<Some error message returned from the component that experienced failure>",
"sha256_hash": "b94c9061de2af1e4ac2abe7e2d350fdb7924e3de21cbb84c4f5a0d456dbd45d9",
"explanation_results": null,
"predicted_at": "2025-01-21T02:49:13.454923",
"celery_task_id": "33f86794-c428-4e24-a40f-963c79a67e83",
"prediction_results": null
}
Example return when there are no decompiled functions
{
"model": "FESTIVE Func Sim",
"error_details": "",
"sha256_hash": "b94c9061de2af1e4ac2abe7e2d350fdb7924e3de21cbb84c4f5a0d456dbd45d9",
"explanation_results": null,
"predicted_at": "2025-01-21T02:49:13.454923",
"celery_task_id": "33f86794-c428-4e24-a40f-963c79a67e83",
"prediction_results": {}
}
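The three return shapes above (populated, `null`, and `{}`) can be handled uniformly. The following is a minimal sketch, not the project's own client code; the function name `best_matches` is hypothetical. It keeps, for each submitted function, the similarity entry with the highest `score`, and skips entries that carry no `score` field.

```python
def best_matches(prediction: dict) -> dict:
    """Map each function hash to its highest-scoring similarity entry."""
    # prediction_results may be null (error) or {} (no decompiled functions).
    results = prediction.get("prediction_results") or {}
    best = {}
    for func_hash, func in results.items():
        candidates = [
            entry
            for entry in (func.get("similarity") or {}).values()
            if isinstance(entry, dict) and "score" in entry
        ]
        if candidates:
            best[func_hash] = max(candidates, key=lambda entry: entry["score"])
    return best
```

Both the erroneous return and the empty return produce an empty dict, so callers can check `error_details` separately to distinguish them.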
/submit/sample
Example return
{
"sample_id": "33f493236245b155335b3d772d7a8bdb9194110d1e4ce052a3b6bdfcc526f680"
}
/version/all
Example return
{
"description": "Follow attribution-api for the overall 'version'",
"attribution-api": "2.4.0",
"crystal-ball": "2.4.0",
"prediction-store": "2.4.0",
"FESTIVE": {
"frontend": "0.4.0",
"app": {
"app": "0.4.0",
"ida_api": "0.2.0"
},
"store": {
"store": "0.4.0",
"postgres": "16.6",
"weaviate": "1.25.30"
}
}
}
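Because the version payload is nested, it can be convenient to flatten it before logging or comparing versions. The sketch below is one possible approach; `flatten_versions` and the dotted key format are hypothetical choices, not part of the API.

```python
def flatten_versions(payload: dict, prefix: str = "") -> dict:
    """Flatten nested version dicts into dotted keys, e.g. 'FESTIVE.app.ida_api'."""
    flat = {}
    for key, value in payload.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested components such as "FESTIVE" -> "app".
            flat.update(flatten_versions(value, path))
        else:
            flat[path] = value
    return flat
```

Note that the free-text `description` field is kept as an ordinary entry.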
Crystal Ball
This is not the same Crystal Ball as in the old attribution-engine, but it serves a similar purpose, so the name was kept.
/queue/status/task_id
Example return
{
"id": "8adafa7c-0156-45f1-b1ff-936361088065",
"status": "PENDING"
}
/revoke/model_name/sha256_hash/task_id
Example return
{
"id": "8adafa7c-0156-45f1-b1ff-936361088065"
}
/queue/FESTIVE Func Sim/sha256_hash
Example return
{
"id": "8adafa7c-0156-45f1-b1ff-936361088065"
}
/version/crystal-ball
Example return
{
"crystal-ball": "2.4.0"
}
Prediction Store
This container interfaces with all the DB operations.
/create/prediction/sha256_hash/model
return None
/create/link/celery_task_id/sample_uuid
return None
/read/predictions/sha256_hash
If there are two entries for two different models, the following is returned.
Example return
[
{
"model": "Some Model 1",
"error_details": "",
"sha256_hash": "33f493236245b155335b3d772d7a8bdb9194110d1e4ce052a3b6bdfcc526f680",
"explanation_results": { ... },
"predicted_at": "2024-09-26T04:12:28.452211",
"celery_task_id": "54c15b7a-a2f4-4c32-8d58-457fa6c76d97",
"prediction_results": { ... }
},
{
"model": "Some Model 2",
"error_details": "",
"sha256_hash": "fd2e2c6612d43bb6b213b72fc53f07d73d99059fa72c96e44bde12e7815073ae",
"explanation_results": null,
"predicted_at": "2024-10-18T07:24:56.645841",
"celery_task_id": "c394df26-b15b-46bf-a0f0-fb5b831c4b5b",
"prediction_results": { ... }
}
]
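Since this endpoint returns a list with one entry per model, a caller will often want to know which models reported an error. The sketch below relies on `error_details` being empty unless an error was returned; the helper name `failed_models` is hypothetical.

```python
def failed_models(entries: list) -> dict:
    """Map model name to its error message for every entry with a non-empty error_details."""
    return {
        entry["model"]: entry["error_details"]
        for entry in entries
        if entry.get("error_details")
    }
```

An empty result means no entry in the list carried an error message.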
/read/prediction/model_name/sha256_hash
Example return
{
"model": "FESTIVE Func Sim",
"error_details": "",
"sha256_hash": "33f493236245b155335b3d772d7a8bdb9194110d1e4ce052a3b6bdfcc526f680",
"explanation_results": null,
"predicted_at": "2024-09-26T04:12:28.452211",
"celery_task_id": "54c15b7a-a2f4-4c32-8d58-457fa6c76d97",
"prediction_results": { ... }
}
/read/link/celery_task_id
Example return
{
"celery_task_id": "54c15b7a-a2f4-4c32-8d58-457fa6c76d97",
"sample_uuid": "0ec00790-47f4-443e-9372-f7b458597b19"
}
/update/prediction/sha256_hash/model/celery_task_id
Example return
{
"celery_task_id": "8adafa7c-0156-45f1-b1ff-936361088065",
"predictions_sha256_hash": "5a09b5d00e111f10d02564dad0cc4e28023f4ac1bb57076c8670e80fee57e5d1",
"predictions_model": "FESTIVE Func Sim"
}
/update/prediction/sha256_hash/model
return None
/update/explanation/sha256_hash/model
return None
/delete/prediction/sha256_hash/model
return None
/version/prediction-store
Example return
{
"prediction-store": "2.4.0"
}
Message Queue
Celery
Celery uses RabbitMQ as its broker and Redis as its backend.
See the Celery documentation on backends and brokers: https://docs.celeryq.dev/en/latest/getting-started/backends-and-brokers/index.html#broker-overview
RabbitMQ
You can access the management UI in a web browser at http://localhost:15672 with the credentials guest:guest.
This only works if you are using docker-compose-dev.yaml, where the ports are exposed.
Redis
Terminal
You can log in with redis-cli -u redis://default:PASSWORD@localhost:6379/0
The syntax for the URI is redis://user:password@host:port/dbnum. Alternatively, run redis-cli --help to see the URI format.
If there are any tasks in the queue, you can list them with the command keys *
You can run GET {key_name} to view the values.
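The URI syntax above can be assembled programmatically, which helps when the password contains characters such as `@` or `/`. The sketch below is illustrative only: `redis_uri` is a hypothetical helper, the defaults mirror the example login command, and percent-encoding the credentials is an assumption about what the consuming tool accepts.

```python
from urllib.parse import quote


def redis_uri(password: str, user: str = "default", host: str = "localhost",
              port: int = 6379, db: int = 0) -> str:
    """Build a redis://user:password@host:port/dbnum URI with encoded credentials."""
    # safe="" makes quote() encode reserved characters like '@', ':' and '/'.
    return f"redis://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}/{db}"
```

For a plain password the result matches the login command shown above.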
Database
Accessing the database
You can either:
- use Docker Desktop to access the terminal
- connect to the terminal from a bash terminal
$ docker exec -it attribution-database bash
Log in to the DB; it will prompt for the password.
/# psql -U *** -d ***
You can also check table names once you’re logged in.
***=# \dt
List of relations
Schema | Name | Type | Owner
--------+-------------+-------+-------
public | predictions | table | ***
(1 row)
You can query from the table if you need to check anything.
select * from predictions;
Remember to end the statement with a semicolon (;).
Tables
predictions
| Column_name | Type | Description |
|---|---|---|
| sha256_hash | CHARACTER VARYING NOT NULL | The binary’s hash |
| model | CHARACTER VARYING NOT NULL | The model used for the prediction |
| predicted_at | TIMESTAMP NOT NULL | Set automatically upon prediction |
| error_details | CHARACTER | Empty unless an error is returned |
| celery_task_id | CHARACTER NOT NULL | The id of the celery task |
| prediction_results | JSONB | The results of the model’s prediction |
| explanation_results | JSONB | The explanation of prediction |
links
| Column_name | Type | Description |
|---|---|---|
| celery_task_id | CHARACTER VARYING NOT NULL | The id of the celery task |
| sample_uuid | CHARACTER VARYING NOT NULL | The UUID (or id) used as the primary key in the model’s internal DB |