Using the VirusTotal API in our projects

To use the VirusTotal programming interfaces without restrictions, you need a paid key, and it is not cheap: prices start at 700 euros per month. Moreover, a private individual will not be given such a key even if they are ready to pay for it.

However, don’t despair: the service provides the main functions for free and limits us only in the number of requests, to no more than four per minute. Well, we will have to put up with it.
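One way to live with a per-minute quota is to space requests out on the client side. The sketch below is only an illustration and is not part of the VirusTotal API: the Throttle class and its parameters are hypothetical names, and the quota is passed in, so use whatever figure applies to your key.

```python
import time


class Throttle:
    """Spaces calls at least 60 / max_per_minute seconds apart."""

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock      # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

Calling throttle.wait() right before every request is enough; the first call returns immediately and later calls pause only when they come too soon.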



Get the API key

So, the first thing we need is to register on the site. That should be no problem; I’m sure you can manage it. After registering, we get the access key from the API key menu item.

API versions

The current version of the API is version 2. However, a new version, number 3, already exists. It is still in beta, but it can already be used, especially since it offers a much wider set of features.

The developers still recommend using the third version only for experiments or non-critical projects. We will look at both versions. The access key is the same for both.



VirusTotal API. Version 2

As with other popular web services, working with the API consists of sending requests over HTTP and receiving responses.

The second version of the API allows you to:

  • send files for scanning;
  • receive a report on previously scanned files using a file identifier (the SHA-256, SHA-1 or MD5 hash of the file, or the scan_id value from the response received after sending the file);
  • send a URL to the server for scanning;
  • receive a report on previously scanned addresses using either the URL itself or the scan_id value from the response received after the URL was sent to the server;
  • receive a report by IP address;
  • receive a report by domain name.

Errors

If the request has been processed correctly and no errors have occurred, code 200 (OK) is returned.

If an error has occurred, the following codes are possible:

    • 204 – Request rate limit exceeded. It occurs when the quota of allowed requests is exceeded (for a free key the quota is four requests per minute);
    • 400 – Bad request. It occurs when a request is formed incorrectly, e.g. if required arguments are missing or have invalid values;
    • 403 – Forbidden. It occurs if you try to use API functions that are available only with a paid key and you don’t have one.

If the request is formed correctly (HTTP status code 200), the response body will be a JSON object with at least two fields:

  • response_code – if the requested object (file, URL, IP address or domain name) is in the VirusTotal database (i.e. it has been checked before) and information about it can be obtained, the value of this field is 1; if the requested object is in the analysis queue, the value is -2; if the requested object is not in the VirusTotal database, the value is 0;
  • verbose_msg – a more detailed description of the response_code value (for example, Scan finished, information embedded in the report on a file that was sent for scanning).

The other information in the JSON response object depends on which API function was used.
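The status codes and response_code values listed above can be folded into one small helper. This is a sketch under the assumption that only the codes named in this article matter; describe_v2_response is a hypothetical name and the message wording is mine.

```python
def describe_v2_response(status_code, body=None):
    """Turn an HTTP status code and a parsed JSON body into a short message."""
    http_errors = {
        204: 'Request rate limit exceeded',
        400: 'Bad request',
        403: 'Forbidden',
    }
    if status_code in http_errors:
        return http_errors[status_code]
    if status_code != 200:
        return 'Unexpected HTTP status {}'.format(status_code)
    # Status 200: the meaning depends on the response_code field of the body.
    response_codes = {
        1: 'object found, report available',
        0: 'object is not in the VirusTotal database',
        -2: 'object is queued for analysis',
    }
    code = (body or {}).get('response_code')
    return response_codes.get(code, 'unknown response_code {!r}'.format(code))
```

With the requests library this could be called as describe_v2_response(response.status_code, response.json()) once the status code is known to be 200, and with the status code alone otherwise.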

Send file to server for scanning

To send a file for scanning, you need to form a POST request to https://www.virustotal.com/vtapi/v2/file/scan, specifying the API access key and attaching the file itself (the file size is limited to 32 MB). In Python it may look like this:

import json
import requests
...
api_url = 'https://www.virustotal.com/vtapi/v2/file/scan'
params = dict(apikey='<access key>')
with open('<path to file>', 'rb') as file:
  files = dict(file=('<path to file>', file))
  response = requests.post(api_url, files=files, params=params)
if response.status_code == 200:
  result=response.json()
  print(json.dumps(result, sort_keys=False, indent=4))
...

Here, instead of <access key>, insert your API access key, and instead of <path to file>, the path to the file you are sending to VirusTotal. If you don’t have the requests library, install it with pip install requests.

In response, if everything went well and the HTTP status code is 200, we will get something like this:

{
  "response_code": 1,
  "verbose_msg": "Scan request successfully queued, come back later for the report",
  "scan_id": "275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f-1577043276",
  "resource": "275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f",
  "sha1": "3395856ce81f2b7382dee72602f798b642f14140",
  "md5": "44d88612fea8a8f36de82e1278abb02f",
  "sha256": "275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f",
  "permalink": "https://www.virustotal.com/file/275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f/analysis/1577043276/"  
}

Here we see the values of response_code and verbose_msg, the SHA-256, SHA-1 and MD5 hashes of the file, a link to the scan results in permalink, and the scan identifier scan_id.

Get a report on your last scan of the file

Using any of the hashes or the scan_id value from the response, you can get a report on the last scan of the file (if the file has already been uploaded to VirusTotal). To do this, form a GET request and specify the access key and the file identifier in it. For example, if we have the scan_id from the previous example, the request will look like this:

import json
import requests
...
api_url = 'https://www.virustotal.com/vtapi/v2/file/report'
params = dict(apikey='<access key>', resource='275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f-1577043276')
response = requests.get(api_url, params=params)
if response.status_code == 200:
  result=response.json()
  print(json.dumps(result, sort_keys=False, indent=4))
...

If successful, we will see the following in response:

{
  "response_code": 1,
  "verbose_msg": "Scan finished, information embedded",
  "resource": "275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f",
  "sha1": "3395856ce81f2b7382dee72602f798b642f14140",
  "md5": "44d88612fea8a8f36de82e1278abb02f",
  "sha256": "275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f",
  "scan_date": "2019-11-27 08:06:03",
  "permalink": "https://www.virustotal.com/file/275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f/analysis/1577043276/",
  "positives": 59,
  "total": 69,
  "scans": {
    "Bkav": {
      "detected": true,
      "version": "1.3.0.9899",
      "result": "DOS.EiracA.Trojan",
      "update": "20191220"
    },
    "DrWeb": {
      "detected": true,
      "version": "7.0.42.9300",
      "result": "EICAR Test File (NOT a Virus!)",
      "update": "20191222"
    },
    "MicroWorld-eScan": {
      "detected": true,
      "version": "14.0.297.0",
      "result": "EICAR-Test-File",
      "update": "20191222"
    },
    ...
    "Panda": {
      "detected": true,
      "version": "4.6.4.2",
      "result": "EICAR-AV-TEST-FILE",
      "update": "20191222"
    },
    "Qihoo-360": {
      "detected": true,
      "version": "1.0.0.1120",
      "result": "qex.eicar.gen.gen",
      "update": "20191222"
    }
  }
}

Here, as in the first example, we get the file hash values, scan_id, permalink, response_code and verbose_msg. We also see the results of scanning the file with each antivirus engine and the overall tally: total is the number of antivirus engines involved, and positives is the number that gave a positive verdict.

To display the scan results of all antivirus engines in a readable form, you can, for example, write something like this:

import requests
...
api_url = 'https://www.virustotal.com/vtapi/v2/file/report'
params = dict(apikey='<access key>', resource='275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f-1577043276')
response = requests.get(api_url, params=params)
if response.status_code == 200:
  result = response.json()
  # Print one block per antivirus engine
  for key in result['scans']:
    print(key)
    print(' Detected: ', result['scans'][key]['detected'])
    print(' Version: ', result['scans'][key]['version'])
    print(' Update: ', result['scans'][key]['update'])
    print(' Result: ', result['scans'][key]['result'])
...
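For a one-line verdict instead of a full listing, the positives, total and scans fields of the report are enough. The following is a minimal sketch that works on an already parsed report; summarize_report is a hypothetical helper, and the sample dictionary is a cut-down version of the report shown earlier with one made-up engine added to show a negative result.

```python
def summarize_report(report):
    """Return (positives, total, sorted names of engines that flagged the file)."""
    scans = report.get('scans', {})
    flagged = sorted(name for name, r in scans.items() if r.get('detected'))
    return report.get('positives', 0), report.get('total', 0), flagged


sample = {
    'positives': 59,
    'total': 69,
    'scans': {
        'Bkav': {'detected': True, 'result': 'DOS.EiracA.Trojan'},
        'DrWeb': {'detected': True, 'result': 'EICAR Test File (NOT a Virus!)'},
        'ExampleEngine': {'detected': False, 'result': None},  # hypothetical entry
    },
}

positives, total, flagged = summarize_report(sample)
print('{}/{} engines flagged the file, including: {}'.format(
    positives, total, ', '.join(flagged)))
```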

Send URL to server for scanning

To send a URL for scanning, we need to form and send a POST request containing the access key and the URL itself:

import json
import requests
...
api_url = 'https://www.virustotal.com/vtapi/v2/url/scan'
params = dict(apikey='<access key>', url='https://xakep.ru/author/drobotun/')
response = requests.post(api_url, data=params)
if response.status_code == 200:
  result=response.json()
  print(json.dumps(result, sort_keys=False, indent=4))
...

In response, we will get roughly the same as when sending a file, except for the hash values. The contents of the scan_id field can be used to retrieve the scan report for this URL.

Get a report of URL scan results

Let’s form a GET request with the access key and specify either the URL itself as a string or the scan_id value obtained with the previous function. It will look like this:

import json
import requests
...
api_url = 'https://www.virustotal.com/vtapi/v2/url/report'
params = dict(apikey='<access key>', resource='https://xakep.ru/author/drobotun/', scan=0)
response = requests.get(api_url, params=params)
if response.status_code == 200:
  result=response.json()
  print(json.dumps(result, sort_keys=False, indent=4))
...

In addition to the access key and the URL string, there is an optional parameter, scan, which is zero by default. If it is set to one and the VirusTotal database has no information about the requested URL (the URL has not been checked before), the URL is automatically sent to the server for checking, and we get the same response as when submitting a URL for scanning. If this parameter is zero (or not specified), we get a report on the URL or, if there is no information about it in the VirusTotal database, a response like this:

{
  "response_code": 0,
  "resource": "<requested URL>",
  "verbose_msg": "Resource does not exist in the dataset"
}

Get information about IP addresses and domains

To check IP addresses and domains, you need to form and send a GET request with the key and the domain name or IP address to check, passed as a string. For a domain it looks like this:

...
api_url = 'https://www.virustotal.com/vtapi/v2/domain/report'
params = dict(apikey='<access key>', domain='<domain name>')
response = requests.get(api_url, params=params)
...

To check the IP address:

...
api_url = 'https://www.virustotal.com/vtapi/v2/ip-address/report'
params = dict(apikey='<access key>', ip='<IP address>')
response = requests.get(api_url, params=params)
...

The responses to such requests are extensive and contain a lot of information. For example, for IP 178.248.232.27 (this is the IP address of the Hacker site, xakep.ru), the beginning of the report received from the VirusTotal server looks like this:

{
  "country": "RU",
  "response_code": 1,
  "as_owner": "HLL LLC",
  "verbose_msg": "IP address in dataset",
  "continent": "EU",
  "detected_urls": [
    {
      "url": "https://xakep.ru/author/drobotun/",
      "positives": 1,
      "total": 72,
      "scan_date": "2019-12-18 19:45:02"
    },
    {
      "url": "https://xakep.ru/2019/12/18/linux-backup/",
      "positives": 1,
      "total": 72,
      "scan_date": "2019-12-18 16:35:25"
    },
    ...
  ]
}
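Since these reports are long, it usually helps to pull out only the part you need. As an illustration, the sketch below extracts the entries of detected_urls from a parsed IP report; flagged_urls is a hypothetical helper and the sample reuses the values from the report above.

```python
def flagged_urls(ip_report, min_positives=1):
    """Return (url, positives, total) tuples with at least min_positives detections."""
    return [
        (item['url'], item['positives'], item['total'])
        for item in ip_report.get('detected_urls', [])
        if item['positives'] >= min_positives
    ]


sample_ip_report = {
    'response_code': 1,
    'detected_urls': [
        {'url': 'https://xakep.ru/author/drobotun/', 'positives': 1,
         'total': 72, 'scan_date': '2019-12-18 19:45:02'},
        {'url': 'https://xakep.ru/2019/12/18/linux-backup/', 'positives': 1,
         'total': 72, 'scan_date': '2019-12-18 16:35:25'},
    ],
}

for url, positives, total in flagged_urls(sample_ip_report):
    print('{}/{}  {}'.format(positives, total, url))
```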

VirusTotal API. Version 3

The third version of the API has many more features than the second, even with a free key. Moreover, while experimenting with the third version, I did not notice any limit on the number of objects (files or addresses) uploaded to the server per minute. It seems that the restrictions do not apply in the beta yet.

The functions of the third version of the API are designed according to REST principles and are easy to understand. The access key here is passed in the request header.

Errors

In the third version of the API, the list of errors (and, accordingly, of HTTP status codes) has grown. The following were added:

  • 401 – User Not Active Error: occurs when the user account is inactive;
  • 401 – Wrong Credentials Error: occurs if a wrong access key is used in the request;
  • 404 – Not Found Error: occurs when the requested analysis object is not found;
  • 409 – Already Exists Error: occurs when the resource already exists;
  • 429 – Quota Exceeded Error: occurs when one of the request quotas (per minute, daily or monthly) is exceeded. As I said before, during my experiments I saw no limit on the number of requests per minute, although I was using a free key;
  • 429 – Too Many Requests Error: occurs when many requests arrive in a short time (it can be caused by server load);
  • 503 – Transient Error: a temporary server error; retrying the request may work.

In case of an error, in addition to the status code, the server returns additional information in JSON form. As it turned out, though, not for all HTTP status codes: for example, for error 404 the additional information is a plain string.

The JSON format for the error is as follows:

{
  "error": {
    "code": "<HTTP status code>",
    "message": "<error description message>"
  }
}
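Because the error body is sometimes JSON and sometimes a plain string, client code should not assume either. The following is a sketch of one tolerant way to read it; parse_v3_error is a hypothetical helper, and falling back to the HTTP status code as the error code is my own choice rather than documented behavior.

```python
import json


def parse_v3_error(status_code, body_text):
    """Return (code, message) for a failed request, tolerating non-JSON bodies."""
    try:
        error = json.loads(body_text)['error']
        return error['code'], error['message']
    except (ValueError, KeyError, TypeError):
        # The body was not the expected JSON object; fall back to the raw text.
        return str(status_code), body_text.strip()
```

With the requests library it could be called as parse_v3_error(response.status_code, response.text) whenever the status code is not 200.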

Functions for working with files

The third version of the API allows you to:

  • upload files to the server for analysis;
  • get a URL for uploading a file larger than 32 MB to the server;
  • receive reports on file analysis;
  • reanalyze a file;
  • receive VirusTotal users’ comments on a file;
  • send your own comment on a file;
  • view the voting results for a file;
  • vote on a file;
  • get extended information about a file.

To upload a file to the server, you need to send it in a POST request. This can be done as follows:

...
api_url = 'https://www.virustotal.com/api/v3/files'
headers = {'x-apikey' : '<API access key>'}
with open('<path to file>', 'rb') as file:
  files = {'file': ('<path to file>', file)}
  response = requests.post(api_url, headers=headers, files=files)
...

In return, we will get the following:

{
  "data": {
    "id": "ZTRiNjgxZmJmZmRkZTNlM2YyODlkMzk5MTZhZjYwNDI6MTU3NzIxOTQ1Mg==",
    "type": "analysis"
  }
}

Here we see the id value, which serves as the identifier of the file analysis. Use this identifier to get information about the analysis of the file in GET requests of the /analyses type (we’ll talk about that later).

To get the URL for uploading a large file (over 32 MB), you need to send a GET request to https://www.virustotal.com/api/v3/files/upload_url. We put the access key in the header:

...
api_url = 'https://www.virustotal.com/api/v3/files/upload_url'
headers = {'x-apikey' : '<API access key>'}
response = requests.get(api_url, headers=headers)
...

In reply we will get JSON with the address where we should upload the file for analysis. The resulting URL can only be used once.
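The choice between the two upload paths depends only on the file size, so it can be made before any request is sent. The sketch below assumes that "32 MB" means 32 * 1024 * 1024 bytes, which the article does not state; upload_plan and the constants are hypothetical names.

```python
API_BASE = 'https://www.virustotal.com/api/v3'
# Assumption: the 32 MB limit is counted as 32 * 1024 * 1024 bytes.
DIRECT_UPLOAD_LIMIT = 32 * 1024 * 1024


def upload_plan(size_bytes):
    """Return (method, url) of the first request to make for a file of this size."""
    if size_bytes > DIRECT_UPLOAD_LIMIT:
        # Large file: first ask for a one-time upload URL, then POST the file there.
        return 'GET', API_BASE + '/files/upload_url'
    return 'POST', API_BASE + '/files'
```

The size itself can be taken from os.path.getsize('<path to file>') before opening the file.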

To get information about a file that the service has already analyzed, you need to make a GET request with the file identifier in the URL (it can be a SHA-256, SHA-1 or MD5 hash). As in the previous cases, we specify the access key in the header:

...
api_url = 'https://www.virustotal.com/api/v3/files/<file identifier value>'
headers = {'x-apikey' : '<API access key>'}
response = requests.get(api_url, headers=headers)
...

In response, we’ll get a file scan report which, in addition to the scan results from all of VirusTotal’s antivirus engines, contains a lot of additional information whose composition depends on the type of the file. For example, for executable files you can see attributes such as these:

{
  "attributes": {
    "authentihash": "8fcc2f670a166ea78ca239375ed312055c74efdc1f47e79d69966461dd1b2fb6",
    "creation_date": 1270596357,
    "exiftool": {
      "CharacterSet": "Unicode",
      "CodeSize": 20480,
      "CompanyName": "TYV",
      "EntryPoint": "0x109c",
      "FileFlagsMask": "0x0000",
      "FileOS": "Win32",
      "FileSubtype": 0,
      "FileType": "Win32 EXE",
      "FileTypeExtension": "exe",
      "FileVersion": 1.0,
      "FileVersionNumber": "1.0.0.0",
      "ImageFileCharacteristics": "No relocs, Executable, No line numbers, No, 32-bit",
      ...
      "SubsystemVersion": 4.0,
      "TimeStamp": "2010:04:07 00:25:57+01:00",
      "UninitializedDataSize": 0
    },
    ...
  }
}

Or, for example, information about sections of an executable file:

{
  "sections": [
    {
      "entropy": 3.94,
      "md5": "681b80f1ee0eb1531df11c6ae115d711",
      "name": ".text",
      "raw_size": 20480,
      "virtual_address": 4096,
      "virtual_size": 16588
    },
    {
      "entropy": 0.0,
      "md5": "d41d8cd98f00b204e9800998ecf8427e",
      "name": ".data",
      "raw_size": 0,
      "virtual_address": 24576,
      "virtual_size": 2640
    },
    ...
  ]
}

If the file hasn’t been uploaded to the server before and hasn’t been analyzed yet, we’ll get a Not Found Error with HTTP status code 404:

{
  "error": {
    "code": "NotFoundError",
    "message": "File \"<file identifier>\" not found"
  }
}

To reanalyze a file, send a POST request to the server, placing the file identifier in the URL and adding /analyse at the end:

...
api_url = 'https://www.virustotal.com/api/v3/files/<file identifier value>/analyse'
headers = {'x-apikey' : '<API access key>'}
response = requests.post(api_url, headers=headers)
...

The response will contain the same file descriptor as in the first case, when uploading the file to the server. As in the first case, the identifier from the descriptor can be used to retrieve information about the analysis of the file through a GET request of the /analyses type.

You can view the comments of the service’s users, as well as the voting results for a file, by sending the appropriate GET request to the server. For comments:

...
api_url = 'https://www.virustotal.com/api/v3/files/<file identifier value>/comments'
headers = {'x-apikey' : '<API access key>'}
response = requests.get(api_url, headers=headers)
...

To obtain the voting results:

...
api_url = 'https://www.virustotal.com/api/v3/files/<file identifier value>/votes'
headers = {'x-apikey' : '<API access key>'}
response = requests.get(api_url, headers=headers)
...

In both cases, you can use the optional limit parameter, which sets the maximum number of comments or votes in a response. For example, it can be used like this:

...
limit = {'limit': '<number of votes in response>'}
api_url = 'https://www.virustotal.com/api/v3/files/<file identifier value>/votes'
headers = {'x-apikey' : '<API access key>'}
response = requests.get(api_url, headers=headers, params=limit)
...

To post your comment or vote on a file, we create a POST request, and the comment or vote is passed as a JSON object:

...
## To send a vote
votes = {'data': {'type': 'vote', 'attributes': {'verdict': '<malicious or harmless>'}}}
api_url = 'https://www.virustotal.com/api/v3/files/<file identifier value>/votes'
headers = {'x-apikey' : '<API access key>'}
response = requests.post(api_url, headers=headers, json=votes)
...
## To send a comment
comments = {'data': {'type': 'comment', 'attributes': {'text': '<comment text>'}}}
headers = {'x-apikey' : '<API access key>'}
api_url = 'https://www.virustotal.com/api/v3/files/<file identifier value>/comments'
response = requests.post(api_url, headers=headers, json=comments)
...
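The two JSON bodies differ only in the object type and in one attribute, so they are easy to build with small functions that also catch typos in the verdict. This is a sketch: vote_payload and comment_payload are hypothetical helpers, and the 'comment' type value is my assumption about what the API expects for comments.

```python
def vote_payload(verdict):
    """Build the JSON body for a vote; only the two verdicts named above are accepted."""
    if verdict not in ('malicious', 'harmless'):
        raise ValueError("verdict must be 'malicious' or 'harmless'")
    return {'data': {'type': 'vote', 'attributes': {'verdict': verdict}}}


def comment_payload(text):
    """Build the JSON body for a comment (assumed object type: 'comment')."""
    return {'data': {'type': 'comment', 'attributes': {'text': text}}}
```

Either result can be passed as the json argument of requests.post, exactly as in the requests above.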

For more information about a file, you can request details about related objects. These objects can describe, for example, the behavior of the file (the behaviours object) or URLs, IP addresses and domain names (the contacted_urls, contacted_ips and contacted_domains objects).

The most interesting object is behaviours. For example, for executable files, it will include information about modules being loaded, processes being created and run, file system and registry operations, and network operations.

To get this information, we send a GET request:

api_url = 'https://www.virustotal.com/api/v3/files/<file identifier value>/behaviours'
headers = {'x-apikey' : '<API access key>'}
response = requests.get(api_url, headers=headers)

The response will be a JSON object with information about the file’s behavior:

{
  "data": [
    {
      "attributes": {
        "analysis_date": 1548112224,
        "command_executions": [
          "C:\\WINDOWS\\system32\\ntvdm.exe -f -i1",
          "/bin/bash /private/tmp/eicar.com.sh"
        ],
        "has_html_report": false,
        "has_pcap": false,
        "last_modification_date": 1577880343,
        "modules_loaded": [
          "c:\\windows\\system32\\user32.dll",
          "c:\\windows\\system32\\imm32.dll",
          "c:\\windows\\system32\\ntdll.dll"
        ]
      },
      ...
    }
  ]
}
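The data array may hold several behaviour reports, so a typical first step is to merge one attribute across all of them. A minimal sketch on an already parsed response follows; collect_behaviour is a hypothetical helper and the sample is a shortened copy of the response above.

```python
def collect_behaviour(response_json, field):
    """Merge one list-valued attribute (e.g. 'modules_loaded') across all reports."""
    seen = []
    for entry in response_json.get('data', []):
        for value in entry.get('attributes', {}).get(field, []):
            if value not in seen:  # keep first-seen order, drop duplicates
                seen.append(value)
    return seen


sample_behaviour = {
    'data': [
        {'attributes': {
            'command_executions': ['C:\\WINDOWS\\system32\\ntvdm.exe -f -i1',
                                   '/bin/bash /private/tmp/eicar.com.sh'],
            'modules_loaded': ['c:\\windows\\system32\\user32.dll',
                               'c:\\windows\\system32\\imm32.dll',
                               'c:\\windows\\system32\\ntdll.dll'],
        }},
    ],
}

for module in collect_behaviour(sample_behaviour, 'modules_loaded'):
    print(module)
```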

Functions for working with URLs

The possible operations with URLs include:

  • send a URL to the server for analysis;
  • receive information about a URL;
  • reanalyze a URL;
  • receive VirusTotal users’ comments on a URL;
  • send your own comment on a URL;
  • receive the voting results for a URL;
  • send your vote on a URL;
  • receive extended information about a URL;
  • get information about the domain or IP address of a URL.

Most of these operations (except for the last one) are performed in the same way as the corresponding operations with files. The URL identifier can be either the URL string encoded in Base64 without the trailing “=” padding characters, or the SHA-256 hash of the URL. This can be implemented as follows:

## For Base64
import base64
...
id_url = base64.urlsafe_b64encode(url.encode('utf-8')).decode('utf-8').rstrip('=')
...
## For SHA-256
import hashlib
...
id_url = hashlib.sha256(url.encode()).hexdigest()

To send a URL for analysis, you need to use a POST request:

data = {'url': '<string with the URL>'}
api_url = 'https://www.virustotal.com/api/v3/urls'
headers = {'x-apikey' : '<API access key>'}
response = requests.post(api_url, headers=headers, data=data)

In response we will see the URL descriptor (similar to the file descriptor):

{
  "data": {
    "id": "u-1a565d28f8412c3e4b65ec8267ff8e77eb00a2c76367e653be774169ca9d09a6-1577904977",
    "type": "analysis"
  }
}

The id from this descriptor is used to retrieve information about the analysis of the URL through a GET request of the /analyses type (this request is covered closer to the end of the article).

You can get information about domains or IP addresses associated with a URL by using a GET request like /network_location (here we use Base64 or SHA-256 URL ID):

api_url = 'https://www.virustotal.com/api/v3/urls/<URL identifier (Base64 or SHA-256)>/network_location'
headers = {'x-apikey' : '<API access key>'}
response = requests.get(api_url, headers=headers)

Other operations with the URL are performed in the same way as similar operations with files.

Functions for domains and IP addresses

This list of functions includes:

  • receive information about a domain or IP address;
  • receive VirusTotal users’ comments on a domain or IP address;
  • send your own comment on a domain or IP address;
  • receive the voting results for a domain or IP address;
  • vote on a domain or IP address;
  • receive extended information about a domain or IP address.

All these operations are implemented in the same way as the corresponding file or URL operations. The difference is that domain names and IP addresses are used directly in the request, rather than identifiers derived from them.

For example, you can get information about www.xakep.ru in this way:

api_url = 'https://www.virustotal.com/api/v3/domains/www.xakep.ru'
headers = {'x-apikey' : '<API access key>'}
response = requests.get(api_url, headers=headers)

And to look at the comments on the IP address 178.248.232.27, for example, like this:

api_url = 'https://www.virustotal.com/api/v3/ip_addresses/178.248.232.27/comments'
headers = {'x-apikey' : '<API access key>'}
response = requests.get(api_url, headers=headers)
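All the version 3 addresses used so far follow one pattern: a collection name, an identifier and, optionally, a related object such as comments or votes. The sketch below builds them in one place; v3_endpoint and COLLECTIONS are hypothetical names, and only the collections that appear in this article are listed.

```python
API_BASE = 'https://www.virustotal.com/api/v3'
# Collection names as they appear in the request URLs in this article.
COLLECTIONS = {'file': 'files', 'url': 'urls', 'domain': 'domains', 'ip': 'ip_addresses'}


def v3_endpoint(kind, identifier, relationship=None):
    """Build a version 3 URL such as .../domains/<name> or .../files/<id>/votes."""
    url = '{}/{}/{}'.format(API_BASE, COLLECTIONS[kind], identifier)
    if relationship:
        url += '/' + relationship
    return url


print(v3_endpoint('domain', 'www.xakep.ru'))
print(v3_endpoint('ip', '178.248.232.27', 'comments'))
```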

GET requests of the /analyses type

Such a request returns the results of the analysis of a file or URL after it has been uploaded to the server or reanalyzed. Use the identifier from the id field of the file or URL descriptor that was returned when you uploaded the file or URL to the server or requested its reanalysis.

For example, such a request for a file can be formed like this:

TEST_FILE_ID = 'ZTRiNjgxZmJmZmRkZTNlM2YyODlkMzk5MTZhZjYwNDI6MTU3NjYwMTE1Ng=='
...
api_url = 'https://www.virustotal.com/api/v3/analyses/' + TEST_FILE_ID
headers = {'x-apikey' : '<API access key>'}
response = requests.get(api_url, headers=headers)

And the option for the URL:

TEST_URL_ID = 'u-dce9e8fbe86b145e18f9dcd4aba6bba9959fdff55447a8f9914eb9c4fc1931f9-1576610003'
...
api_url = 'https://www.virustotal.com/api/v3/analyses/' + TEST_URL_ID
headers = {'x-apikey' : '<API access key>'}
response = requests.get(api_url, headers=headers)
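An analysis may not be finished when it is first requested, so in practice the /analyses request is repeated until it is. The sketch below shows one way to do that. It rests on an assumption that is not covered in this article: that the response carries a data.attributes.status field which becomes 'completed' when the analysis is done. The wait_for_analysis function is a hypothetical helper, and fetch is any function of yours that performs the GET request and returns the parsed JSON.

```python
import time


def wait_for_analysis(fetch, analysis_id, attempts=10, delay=15.0, sleep=time.sleep):
    """Call fetch(analysis_id) until the analysis reports that it is finished.

    fetch is any callable that returns the parsed JSON of a /analyses request.
    """
    for attempt in range(attempts):
        result = fetch(analysis_id)
        # Assumption: the status lives at data.attributes.status.
        status = result.get('data', {}).get('attributes', {}).get('status')
        if status == 'completed':
            return result
        if attempt < attempts - 1:
            sleep(delay)  # pause before asking again
    raise TimeoutError(
        'analysis {} not finished after {} attempts'.format(analysis_id, attempts))
```

A fetch function could be as simple as lambda i: requests.get(api_url_base + i, headers=headers).json(), where api_url_base is the /analyses/ address used above.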

Conclusion

We have gone through all the main functions of the VirusTotal API. You can borrow this code for your own projects. If you use the second version, you will need to make sure not to send requests too often, whereas in the third version I have not run into such a restriction so far. I recommend choosing the third version, because it also offers far more features. Besides, sooner or later it will become the main one.



