# API Reference
Sayari exposes all graph-related resources through a REST API. Requests are made with standard HTTP verbs and accept parameters either as query strings or as JSON-encoded request bodies; the response format is selected through content negotiation.
# Authentication
Authentication to the API is performed via JWT access tokens. To make API calls, you'll need to set the following variables in order to obtain a token.
CLIENT_ID
CLIENT_SECRET
The bearer token will then be granted by requesting the token resource.
curl --request POST \
--url https://api.sayari.com/oauth/token \
--header 'content-type: application/json' \
--data '{
"client_id": "'"$CLIENT_ID"'",
"client_secret": "'"$CLIENT_SECRET"'",
"audience":"sayari.com",
"grant_type":"client_credentials"
}'
{
"access_token": "sk_test_4eC39HqLyjWDarjtT1zdp7dc",
"token_type": "Bearer"
}
This token will expire in 24 hours, at which point a new token should be requested.
To use the token to authenticate HTTP requests against the Sayari API, pass the bearer token in the request's Authorization header. For example, using the token retrieved above:
curl 'https://api.sayari.com/search/entity?q=china' -H"Authorization: Bearer sk_test_4eC39HqLyjWDarjtT1zdp7dc"
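The same token exchange can be sketched in Python. This is a minimal illustration rather than an official client: the function names `build_token_request` and `bearer_header` are hypothetical, and only the URL, body fields, and header come from the curl examples above. The request is built but not sent, so the sketch can be inspected without credentials.

```python
import json
import urllib.request

TOKEN_URL = "https://api.sayari.com/oauth/token"  # taken from the curl example above


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the token resource."""
    body = {
        "client_id": client_id,
        "client_secret": client_secret,
        "audience": "sayari.com",
        "grant_type": "client_credentials",
    }
    return urllib.request.Request(
        TOKEN_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"content-type": "application/json"},
        method="POST",
    )


def bearer_header(access_token: str) -> dict:
    """Return the Authorization header to attach to later API calls."""
    return {"Authorization": f"Bearer {access_token}"}
```

To send the request, pass it to `urllib.request.urlopen` and read `access_token` from the decoded JSON response. Because the token expires after 24 hours, a long-running client would need to repeat this exchange.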
# Requests
Sayari utilizes standard HTTP verbs to indicate request intent. All resources can be requested using GET requests with request parameters serialized in the URL query string, e.g. /entity/:id?referenced_by.offset=50&attributes.address.limit=10.
# Pagination
Response fields that represent unbounded collections, such as search results, an entity's attributes or relationships, or a record's references, can be paginated when the collection is larger than can be efficiently returned in a single request. Paginated requests take one of two forms: token pagination for the entity endpoint, and offset pagination for all other endpoints.
# Token Pagination
Token paginated requests specify optional next or prev parameters to retrieve the next or previous page of results. Omitting both token parameters will return the first page of results, which may contain a next token in the response body if there are more results to return. Result pages beyond the first will include a prev token in the response body, allowing for backwards pagination. The limit parameter will restrict the size of the page, up to whatever default is set for that result type.
# Arguments
- next: string optional
- Token to retrieve the next page of results
- prev: string optional
- Token to retrieve the previous page of results
- limit: integer optional
- A limit on the number of objects to be returned. Defaults to 100.
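The token flow described above can be sketched as a small generator. This is only an illustration: `fetch_page` is a hypothetical callable that you would supply to send the given pagination parameters and return the paginated collection as a dict. The sketch assumes a page without a `next` token is the last one.

```python
from typing import Callable, Dict, Iterator, List, Optional


def iter_token_pages(
    fetch_page: Callable[[Dict[str, object]], dict],
    limit: int = 100,
) -> Iterator[List[object]]:
    """Yield each page's `data` list, following `next` tokens until none is returned."""
    token: Optional[str] = None
    while True:
        params: Dict[str, object] = {"limit": limit}
        if token is not None:
            params["next"] = token
        page = fetch_page(params)
        yield page.get("data", [])
        token = page.get("next")
        if not token:
            break
```

Passing the fetch logic in as a callable keeps the loop independent of any particular HTTP library.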
# Offset Pagination
Offset paginated requests specify optional offset and limit parameters for the paginatable list. The response data will include a next boolean field that indicates whether more results remain beyond the current page, and may also include a size key with a count indicating the total length of the collection. Collections whose size is not feasible to compute, such as traversals, will not have a size key.
# Arguments
- limit: integer optional
- A limit on the number of objects to be returned. Defaults to 100.
- offset: integer optional
- Number of results to skip before returning response. Defaults to 0.
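A minimal sketch of an offset-pagination loop follows. The `fetch_page` callable is hypothetical: it stands in for whatever code sends `offset` and `limit` to an endpoint and returns the decoded response. The loop assumes that `next` being true means at least one more page is available, as in the examples in this section.

```python
from typing import Callable, Iterator, List


def iter_offset_pages(
    fetch_page: Callable[[int, int], dict],
    limit: int = 100,
) -> Iterator[List[object]]:
    """Yield each page's `data` list, advancing `offset` while `next` is true."""
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        yield page.get("data", [])
        if not page.get("next"):
            break
        offset += limit
```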
# Paginating Search
The search endpoint is paginated via offsets in cases where the search result size is greater than the page limit size. For very large result sets, the search count may be estimated.
GET /search/entity?q=china&offset=10&limit=5 HTTP/1.1
Accept: application/json
{
"offset": 10,
"limit": 5,
"next": true,
"size": {
"count": 10000,
"qualifier": "gte"
},
"data": [...]
}
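The `size` object can be turned into a display string as sketched below. The only qualifier values that appear in this document are `eq` and `gte`. Treating `gte` as "the real count is at least this number" is an assumption based on the note that large counts may be estimated. The helper name `describe_size` is hypothetical.

```python
from typing import Optional


def describe_size(size: Optional[dict]) -> str:
    """Render a `size` object; collections without one (e.g. traversals) are 'unknown'."""
    if not size:
        return "unknown"
    count = size["count"]
    if size.get("qualifier") == "gte":
        # Assumption: "gte" marks the count as a lower bound.
        return f"at least {count}"
    return str(count)
```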
# Paginating Entities
An Entity's attributes, relationships, possibly-same-as entities, and record references can all be paginated via tokens.
Pagination next or prev tokens are passed as query parameters with any combination of the following:
- attributes.[field].['next' | 'prev']=[string], e.g. ?attributes.address.next=qr7bvn2
- relationships.['next' | 'prev']=[string], e.g. ?relationships.next=qr7bvn2
- possibly_same_as.['next' | 'prev']=[string], e.g. ?possibly_same_as.next=qr7bvn2
- referenced_by.['next' | 'prev']=[string], e.g. ?referenced_by.next=qr7bvn2
For example, this HTTP request could return the following result:
GET /entity/abc?attributes.name.limit=10&relationships.next=qr7bvn2&relationships.limit=150 HTTP/1.1
Accept: application/json
{
...
"attributes": {
"name": {
"next": "y4rkp09",
"limit": 10,
"size": {
"count": 18,
"qualifier": "eq"
},
"data": [...]
}
},
"relationships": {
"next": "w98vmfd",
"prev": "myvc64l",
"limit": 150,
"size": {
"count": 1201,
"qualifier": "eq"
},
"data": [...]
},
"possibly_same_as": {
"limit": 100,
"size": {
"count": 12,
"qualifier": "eq"
},
"data": [...]
},
"referenced_by": {
"next": "84ct7eb",
"limit": 100,
"size": {
"count": 300,
"qualifier": "eq"
},
"data": [...]
}
}
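As a rough sketch, the dotted pagination parameters can be assembled with the standard library, and the `next` tokens in a response can be collected into the parameters for a follow-up request. The helper names `build_entity_path` and `next_page_params` are hypothetical. The response shape is assumed to match the example above.

```python
from urllib.parse import urlencode


def build_entity_path(entity_id: str, params: dict) -> str:
    """Serialize dotted pagination parameters into an entity request path."""
    query = urlencode(params)
    return f"/entity/{entity_id}?{query}" if query else f"/entity/{entity_id}"


def next_page_params(entity: dict) -> dict:
    """Collect `*.next` query parameters for every collection that has a next token."""
    params = {}
    for field, page in entity.get("attributes", {}).items():
        if page.get("next"):
            params[f"attributes.{field}.next"] = page["next"]
    for key in ("relationships", "possibly_same_as", "referenced_by"):
        page = entity.get(key, {})
        if page.get("next"):
            params[f"{key}.next"] = page["next"]
    return params
```

Collections without a `next` token, such as `possibly_same_as` in the example response, are skipped because they have no further pages.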
# Paginating Records
A Record's entity references can be paginated via offsets in cases where the record cites more entities than can be returned in a single request.
For example, this HTTP request could yield something like the following result:
GET /record/123?references.offset=100&references.limit=100 HTTP/1.1
Accept: application/json
{
...
"references": {
"next": false,
"offset": 100,
"limit": 100,
"size": {
"count": 140,
"qualifier": "eq"
},
"data": [...]
}
}
# Paginating Traversals
The traversal endpoints' response paths are paginated via offsets. Because the total number of potential matching paths is very expensive to compute, the response does not include size values. Instead, use the next boolean field to determine if there are more pages of results to return.
GET /traversal/123?offset=20&max_depth=8 HTTP/1.1
Accept: application/json
{
"offset": 20,
"limit": 20,
"next": true,
"data": [...]
}
# Responses
# Success
All successful requests will be indicated with 2xx status codes.
Code | Response | Description |
---|---|---|
200 | OK | Successful GET request. |
201 | Created | Successful POST request. |
# Errors
All errors will be returned with the corresponding HTTP response status indicating the reason for a failed request.
Code | Response | Description |
---|---|---|
400 | Bad Request | Incorrectly formatted request. |
401 | Unauthorized | Request made without valid token. |
404 | Not Found | Resource not found or does not exist. |
405 | Method Not Allowed | Request made with an unsupported HTTP method. Currently only GET and POST supported. |
406 | Not Acceptable | Request made in an unacceptable state. This is most commonly due to parameter validation errors. |
415 | Unsupported Media Type | Accept header on request set to an unsupported media type. Currently only application/json and text/csv supported for indicated resources. |
429 | Rate Limited | Too many requests within too short of a period. The reply will contain a retry-after header that indicates when the client can safely retry. |
500 | Internal Server Error | Internal server error occurred. |
The error will also be indicated as a JSON object in the body in the following format:
{
"status": 500,
"success": false,
"messages": ["Internal Server Error"]
}
Validation messages on request parameters will also be displayed in the messages field to give more information on the failed request.
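A client might normalize failed responses along these lines. This is a sketch under stated assumptions: the function name `parse_error` is hypothetical, and the body is assumed to follow the JSON format shown above when present. The retry-after value is returned as a raw string because this document does not specify its format.

```python
import json
from typing import Mapping


def parse_error(status_code: int, body: str, headers: Mapping[str, str]) -> dict:
    """Extract the messages from an error body and, for 429, the retry-after header."""
    try:
        payload = json.loads(body)
    except ValueError:
        payload = {}  # body was empty or not JSON
    messages = payload.get("messages", []) if isinstance(payload, dict) else []
    retry_after = None
    if status_code == 429:
        # Header names are case-insensitive, so normalize before looking up.
        lowered = {name.lower(): value for name, value in headers.items()}
        retry_after = lowered.get("retry-after")
    return {"status": status_code, "messages": messages, "retry_after": retry_after}
```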
# Types
# Dates
Date strings are formatted as YYYY[-MM[-DD]], meaning 2000-10-02, 2000-10, and 2000 are all valid date formats. Dates without day or month-day segments appear where the day or month is either not known or not relevant.
Entity attributes and entity relationships may have a from_date and/or to_date field to indicate when an attribute or relationship value started or ended, as well as an optional generic date field. Records also have date fields to indicate when the record was published (publication_date) and when it was acquired by Sayari (acquisition_date).
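Because the day and month are optional, these values cannot always be parsed as full calendar dates. One way to handle them is sketched below. The names `PartialDate` and `parse_partial_date` are hypothetical. The parser checks only the YYYY[-MM[-DD]] shape, not whether the date exists on the calendar.

```python
import re
from typing import NamedTuple, Optional

_DATE_RE = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")


class PartialDate(NamedTuple):
    year: int
    month: Optional[int] = None
    day: Optional[int] = None


def parse_partial_date(value: str) -> PartialDate:
    """Parse a YYYY[-MM[-DD]] string, leaving missing segments as None."""
    match = _DATE_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"not a YYYY[-MM[-DD]] date: {value!r}")
    year, month, day = match.groups()
    return PartialDate(
        int(year),
        int(month) if month else None,
        int(day) if day else None,
    )
```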
# Countries
Country IDs use the ISO 3166-1 alpha-3 (three-letter) country code.
# Source
{
"id": "b9dc2ca839c318d04910a8a680131fdf",
"label": "Albania Trade Register Extracts",
"country": "ALB"
}
# EmbeddedEntity
{
"id": "123",
"label": "ACME Co.",
"type": "company",
"entity_url": "/entity/123",
"identifiers": [{ "type": "uk_company_number", "value": "12345" }],
"countries": ["GBR"],
"closed": false,
"pep": false,
"sanctioned": false,
"psa_sanctioned": "123456",
"psa_count": 2,
"source_count": {
"some_source_id": {
"count": 2,
"label": "Some Source Label"
}
},
"degree": 304,
"addresses": ["32535 31st Rd, Arkansas City, KS, 67005"],
"date_of_birth": "1990-08-03",
"relationship_count": {
"has_shareholder": 300,
"shareholder_of": 4
}
}
# EmbeddedRecord
{
"id": "abc",
"label": "Some Record - 1/14/2020",
"source": "some_source_id",
"publication_date": "2019-02-04",
"acquisition_date": "2019-02-05",
"references_count": 2,
"record_url": "record/abc",
"source_url": "https://entity.com/company/12345"
}
# PathSegment
{
"field": "shareholder_of",
"relationships": {
"shareholder_of": {
"values": [
{
"record": "ecdfb3f2ecc8c3797e77d5795a8066ef/123567",
"attributes": {
"shares": [
{
"percentage": 100,
"monetary_value": 2100000,
"currency": "USD"
}
]
},
"date": "2018-06-14"
}
]
},
"director_of": {
"values": [
{
"record": "ecdfb3f2ecc8c3797e77d5795a8066ef/123567",
"attributes": {
"position": [{ "value": "Director" }]
},
"from_date": "2007-12-01",
"to_date": "2015-05-01",
"acquisition_date": "2021-04-14",
"publication_date": "2021-04-14"
}
],
"former": true
}
},
"entity": EmbeddedEntity
}
# PossiblySameAsMatches
{
"name": [
{
"target": "John Smith",
"source": "John C Smith"
}
],
"date_of_birth": [
{
"target": "1970-05-02",
"source": "1970-05"
}
]
}
# Resource Endpoints
# Entity
GET /v1/entity/:id
Accept: application/json
An entity represents a single real-world thing, such as a person, company, or land property, that has been extracted from one or more records. The entity response includes information on that entity's attributes and relationships, as well as the records the entity is sourced to. Entity responses include paginated lists of attributes and relationships, with limit defaulting to 100. Paginate these lists using the next or prev token included in the response.
# Arguments
- attributes.[field].next: string optional
- The pagination token for the next page of attribute `[field]`, e.g. name, address, or country.
- attributes.[field].prev: string optional
- The pagination token for the previous page of attribute `[field]`, e.g. name, address, or country.
- attributes.[field].limit: integer optional
- Limit total values returned for attribute `[field]`. Defaults to 100.
- relationships.next: string optional
- The pagination token for the next page of relationship results
- relationships.prev: string optional
- The pagination token for the previous page of relationship results
- relationships.limit: integer optional
- Limit total relationship values. Defaults to 100.
- relationships.type: string optional
- Filter relationships to relationship type, e.g. director_of or has_shareholder
- relationships.sort: string optional
- Sorts relationships by As Of date or Shareholder percentage, e.g. date or -shares
- relationships.startDate: date optional
- Filters relationships to after a date
- relationships.endDate: date optional
- Filters relationships to before a date
- relationships.minShares: integer optional
- Filters relationships to greater than or equal to a Shareholder percentage
- relationships.country: string[] optional
- Filters relationships to a list of countries
- relationships.arrivalCountry: string[] optional
- Filters shipment relationships to a list of arrival countries
- relationships.departureCountry: string[] optional
- Filters shipment relationships to a list of departure countries
- relationships.hsCode: string optional
- Filters shipment relationships to an HS code
- possibly_same_as.next: string optional
- The pagination token for the next page of possibly same entities.
- possibly_same_as.prev: string optional
- The pagination token for the previous page of possibly same entities.
- possibly_same_as.limit: integer optional
- Limit total possibly same as entities. Defaults to 100.
- referenced_by.next: string optional
- The pagination token for the next page of the entity's referencing records
- referenced_by.prev: string optional
- The pagination token for the previous page of the entity's referencing records
- referenced_by.limit: integer optional
- Limit total values returned for the entity's referencing records. Defaults to 100.
# Content Types
Supported content types include:
- application/json
- JSON response.
- text/csv
- CSV response.
- application/pdf
- PDF response.
- application/vnd.ms-excel
- Microsoft Excel XLSX response.
# Examples
GET /v1/entity/123?attributes.name.limit=10&relationships.next=wkdjtrsdre HTTP/1.1
Accept: application/json
{
"id": "123",
"label": "ACME Co.",
"type": "company",
"entity_url": "/v1/entity/123",
"identifiers": [{ "type": "uk_company_number", "value": "12345" }],
"countries": ["GBR"],
"source_count": {
"some_source_id": {
"count": 2,
"label": "UK Companies House"
}
},
"relationship_count": {
"has_shareholder": 300
},
"attributes": {
"name": {
"next": "b9dc2ca839c31",
"limit": 100,
"size": {
"count": 4,
"qualifier": "eq"
},
"data": [
{
"properties": {
"value": "Acme Co."
},
"record": ["record_id"],
"record_count": 3
},
...
]
},
...
},
"relationships": {
"next": "b9dc2ca839c31",
"prev": "49dc2ca839c31",
"limit": 100,
"size": {
"count": 300,
"qualifier": "eq"
},
"data": [
{
"target": EmbeddedEntity,
"types": {
"shareholder_of": [
{
"record": "record-id",
"attributes": {
"shares": [
{ "percentage": 100, "monetary_value": 2100000, "currency": "USD" }
]
},
"date": "2018-06-14"
}
]
}
},
...
]
},
"possibly_same_as": {
"next": "b9dc2ca839c31",
"limit": 100,
"size": {
"count": 2,
"qualifier": "eq"
},
"data": [
{
"entity": EmbeddedEntity,
"matches": PossiblySameAsMatches
},
...
]
},
"referenced_by": {
"next": "b9dc2ca839c31",
"limit": 100,
"size": {
"count": 3,
"qualifier": "eq"
},
"data": [
{
"record": EmbeddedRecord,
"type": "about"
},
...
]
}
}
# Entity Summary
GET /v1/entity_summary/:id
Accept: application/json
The Entity Summary endpoint returns a smaller entity payload, including:
- up to 50 values for each of the following attributes: name, address, identifier, weak_identifier, status, company_type, contact, business_purpose, country
- up to 50 entities that are possibly the same as the target entity
- up to 100 records the entity is sourced to
# Content Types
Supported content types include:
- application/json
- JSON encoded response.
# Examples
GET /v1/entity_summary/123 HTTP/1.1
Accept: application/json
{
"id": "123",
"label": "ACME Corp.",
"degree": 3,
"risk": {
"basel_aml": {
"value": 4.63,
"metadata": { "country": ["USA"] }
},
"cpi_score": {
"value": 67,
"metadata": { "country": ["USA"] }
},
"sanctioned_distance": {
"value": 3,
"metadata": { }
}
},
"psa_count": 1,
"type": "company",
"entity_url": "/v1/entity/123",
"identifiers": [{ "type": "uk_company_number", "value": "1234" }],
"countries": ["GBR"],
"source_count": {
"ecdfb3f2ecc8c3797e77d5795a8066ef": {
"count": 6,
"label": "UK Corporate Registry"
},
"2a4fe9a14e332c8f9ded1f8a457c2b89": {
"count": 12,
"label": "UK Land Commercial and Corporate Ownership Data (CCOD)"
}
},
"relationship_count": {
"has_registered_agent": 1,
"has_shareholder": 2,
"linked_to": 2
},
"attributes": {
"identifier": {
"offset": 0,
"limit": 20,
"next": false,
"size": {
"count": 1,
"qualifier": "eq"
},
"data": [
{
"properties": {
"value": "1234",
"type": "uk_company_number"
},
"record": ["abc"],
"record_count": 21
}
]
},
...
}
}
# Traversal
GET /v1/traversal/:id
Accept: application/json
The Traversal endpoint returns paths from a single target entity to up to 50 directly or indirectly related entities. Each path includes information on the 0 to 10 intermediary entities, as well as their connecting relationships. The response's explored_count field indicates the size of the graph subset the application searched. Running a traversal on a highly connected entity with a restrictive set of argument filters and a high max depth will require the application to explore a higher number of traversal paths, which may affect performance.
# Arguments
- offset: integer optional
- Offset values for traversal. Defaults to 0.
- limit: integer optional
- Limit total values for traversal. Defaults to 20.
- min_depth: integer optional
- Set minimum depth for traversal. Defaults to 1.
- max_depth: integer optional
- Set maximum depth for traversal. Defaults to 6.
- relationships: string, string[] optional
- Set relationship type(s) to follow when traversing related entities. Defaults to following all relationship types.
- psa: boolean optional
- Also traverse relationships from entities that are possibly the same as any entity that appears in the path. Defaults to not traversing possibly same as relationships.
- countries: string, string[] optional
- Filter paths to only those that end at an entity associated with the specified country(ies). Defaults to returning paths that end in any country.
- types: string, string[] optional
- Filter paths to only those that end at an entity of the specified type(s). Defaults to returning paths that end at any type.
- sanctioned: boolean optional
- Filter paths to only those that end at an entity appearing on a watchlist. Defaults to not filtering paths by sanctioned status.
- pep: boolean optional
- Filter paths to only those that end at an entity appearing on a pep list. Defaults to not filtering paths by pep status.
- min_shares: integer optional
- Set minimum percentage of share ownership for traversal. Defaults to 0.
- include_unknown_shares: boolean optional
- Also traverse relationships when share percentages are unknown. Only useful when min_shares is set greater than 0. Defaults to true.
- exclude_former_relationships: boolean optional
- Exclude relationships that were valid in the past but not at the present time. Defaults to false.
- exclude_closed_entities: boolean optional
- Exclude entities that existed in the past but not at the present time. Defaults to false.
- eu_high_risk_third: boolean optional
- reputational_risk_modern_slavery: boolean optional
- state_owned: boolean optional
- formerly_sanctioned: boolean optional
- reputational_risk_terrorism: boolean optional
- reputational_risk_organized_crime: boolean optional
- reputational_risk_financial_crime: boolean optional
- reputational_risk_bribery_and_corruption: boolean optional
- reputational_risk_other: boolean optional
- reputational_risk_cybercrime: boolean optional
- regulatory_action: boolean optional
- law_enforcement_action: boolean optional
- xinjiang_geospatial: boolean optional
The additional risk filters above (eu_high_risk_third through xinjiang_geospatial) filter paths to only those that end at an entity that we have flagged with the corresponding risk factor. Details about what these risk factors indicate can be found here.
# Examples
GET /v1/traversal/123?max_depth=6&relationships=has_shareholder&relationships=branch_of&psa HTTP/1.1
Accept: application/json
{
"min_depth": 1,
"max_depth": 6,
"relationships": ["has_shareholder", "branch_of"],
"countries": [],
"types": [],
"psa": true,
"offset": 0,
"limit": 20,
"next": true,
"data": [
{
"target": EmbeddedEntity,
"path": PathSegment[]
},
...
],
"explored_count": 1201
}
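Array-valued arguments such as relationships are sent by repeating the parameter, as in the request above. The sketch below builds such a path. The helper name `build_traversal_path` is hypothetical. Booleans are written as `true`/`false`, which is an assumption: the example request passes `psa` as a bare flag, and this document does not say which spellings are accepted.

```python
from urllib.parse import urlencode


def build_traversal_path(entity_id: str, **filters: object) -> str:
    """Build a traversal path, repeating the key for list-valued arguments."""
    pairs = []
    for name, value in filters.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        for item in values:
            if isinstance(item, bool):
                # Assumption: booleans are serialized as lowercase true/false.
                item = "true" if item else "false"
            pairs.append((name, item))
    query = urlencode(pairs)
    return f"/v1/traversal/{entity_id}" + (f"?{query}" if query else "")
```

The UBO, Ownership, and Watchlist endpoints described later are documented as shorthands for traversal queries, so the same kind of helper could build their equivalent traversal requests.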
# UBO
GET /v1/ubo/:id
Accept: application/json
The UBO endpoint returns paths from a single target entity to up to 50 beneficial owners. The endpoint is a shorthand for the equivalent traversal query:
GET /v1/traversal/:id?relationships=has_shareholder&relationships=has_beneficial_owner&relationships=has_owner&relationships=subsidiary_of&relationships=branch_of
# Arguments
See Traversal
# Examples
See Traversal
# Ownership
GET /v1/downstream/:id
Accept: application/json
The Ownership endpoint returns paths from a single target entity to up to 50 entities directly or indirectly owned by that entity. The endpoint is a shorthand for the equivalent traversal query:
GET /v1/traversal/:id?relationships=shareholder_of&relationships=beneficial_owner_of&relationships=owner_of&relationships=has_subsidiary&relationships=has_branch
# Arguments
See Traversal
# Examples
See Traversal
# Watchlist
GET /v1/watchlist/:id
Accept: application/json
The Watchlist endpoint returns paths from a single target entity to up to 50 other entities that appear on a watchlist or are peps. The endpoint is a shorthand for the equivalent traversal query:
GET /v1/traversal/:id?watchlist
# Arguments
See Traversal
# Examples
See Traversal
# Shortest Path
GET /v1/shortest_path
Accept: application/json
The Shortest Path endpoint returns a response identifying the shortest traversal path connecting each pair of entities.
# Arguments
- entities: string[]
- Entity ids
# Examples
GET /v1/shortest_path?entities=123&entities=345 HTTP/1.1
Accept: application/json
{
"entities": [
"123",
"345"
],
"data": [
{
"target": EmbeddedEntity,
"path": PathSegment[]
}
]
}
# Record
GET /v1/record/:id
Accept: application/json
Records represent the documents used to source entities. The record resource includes metadata about the document itself as well as the entities extracted from the document.
# Arguments
- references.limit: integer optional
- A limit on the number of references to be returned. Defaults to 100.
- references.offset: integer optional
- Number of references to skip before returning response. Defaults to 0.
# Content Types
Supported content types include:
- application/json
- JSON encoded response.
# Examples
GET /v1/record/:id HTTP/1.1
Accept: application/json
{
"id": "1",
"label": "foo",
"source_url": "/source/1",
"publication_date": "2018-02-28",
"acquisition_date": "2018-06-13",
"references_count": 3,
"record_url": "/record/1",
"document_urls": [
"/document/1/file/company-html.html"
],
"references": {
"next": false,
"offset": 0,
"limit": 100,
"size": {
"count": 1,
"qualifier": "eq"
},
"data": EmbeddedEntity[]
}
}
# Fields
- id: string
- Internal ID of the associated record.
- label: string
- Human readable label for the record.
- source_url: string
- Url to the source associated with the record.
- publication_date: date
- Date of record publication.
- acquisition_date: date
- The date Sayari acquired the source document.
- references_count: integer
- Count of entities referenced in source document.
- record_url: string
- Url to current record.
- document_urls: string array optional
- A list of document urls for downloading the underlying source document. If omitted then source document is not able to be downloaded.
- references: paginated search entity
- A list of embedded entity references with meta indicating the type of reference.
# Reference Types
An about reference gives a list of entities that the record is specifically about. In most cases there will be a single entity of this type.
The mentions reference is the other reference type and corresponds to another entity mentioned in the record.
# Search
GET /v1/search/[entity|record]
Accept: application/json
The search endpoint lets you search Sayari's internal data using text queries. Both entities and records have associated search endpoints that share common arguments.
# Arguments
- q: string
- Query term. The syntax for the query parameter follows elasticsearch simple query string syntax. This includes the ability to use search operators and to perform nested queries. Must be URL encoded.
- filter: Filter array optional
- Filters to be applied to the search query to limit the result set. Set as an object {"filter_key": "value"} where filter_key can be one of source, country, state, city, entity_type, bounds, or risk. Serialized to the query string if not sent in a POST request.
The country filter accepts ISO 3166 codes and state accepts abbreviations.
The bounds filter is a pipe-delimited (|) string of four numeric values representing a bounding box: the north latitude, west longitude, south latitude, and east longitude bounds. For example: 46.12|-76|45|-75
- fields: string array
- Record or entity fields to search against.
- facets: boolean optional
- Whether or not to return search facets in results giving counts by field. Defaults to false.
- geo_facets: boolean optional
- Whether or not to return search geo bound facets in results giving counts by geo tile. Defaults to false.
- advanced: boolean optional
- Set to true to enable full elasticsearch query string syntax which allows for fielded search and more complex operators. Note that the syntax is more strict and can result in empty result-sets. Defaults to false.
- limit: integer optional
- A limit on the number of objects to be returned with a range between 1 and 100. Defaults to 100.
- offset: integer optional
- Number of results to skip before returning response. Defaults to 0.
# Content Types
Supported content types include:
- application/json
- JSON encoded response.
- text/csv
- CSV string representation of response.
# Examples
To make a simple request, use the q parameter to search against a query. The following are equivalent:
GET /v1/search/record?q=test HTTP/1.1
Accept: application/json
POST /v1/search/record HTTP/1.1
Accept: application/json
{
"q": "test"
}
WARNING
All search arguments passed through the query string of the GET request must be URL encoded. For example, foo OR (bar~5) becomes foo%20OR%20(bar~5).
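In Python, this encoding can be produced with the standard library, as in the sketch below. The helper name `encode_query` is hypothetical. Parentheses and the tilde are kept unescaped to match the example above, which is a presentation choice: percent-encoding them as well would still be valid URL encoding.

```python
from urllib.parse import quote


def encode_query(q: str) -> str:
    """Percent-encode a search query, leaving parentheses and ~ readable."""
    return quote(q, safe="()~")


path = "/v1/search/record?q=" + encode_query("foo OR (bar~5)")
```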
# Fields
To search against a specific field, use the fields parameter in the request. This request will search addresses and names for apple.
GET /v1/search/record?q=apple&fields=address&fields=name HTTP/1.1
Accept: application/json
POST /v1/search/record HTTP/1.1
Accept: application/json
{
"q": "apple",
"fields": ["address", "name"]
}
The following fields are supported:
- name
- identifier
- address
- business_purpose
- date_of_birth
- contact
# Filters
Filters are parameters that limit the result set after the query is run. They can have different representations depending on request type. If using a GET request, filters can be added to the query string as an array with the format filter=id. For example, a filter against sources for UK Corporate Registry, which has the id ecdfb3f2ecc8c3797e77d5795a8066ef, would look like filters=source%3Decdfb3f2ecc8c3797e77d5795a8066ef. Alternatively, the filter can be expressed in the body of the request as a Filter object which looks like the following:
{
"filters": {
"source": ["ecdfb3f2ecc8c3797e77d5795a8066ef"]
}
}
Notice the array type indicating that the query can be filtered by multiple sources. An example filtered request by multiple sources would appear as follows:
GET /v1/search/entity?filters=source=ecdfb3f2ecc8c3797e77d5795a8066ef&filters=source=4ea8bac1bed868e1510ffd21842e9551&q=example HTTP/1.1
Accept: application/json
POST /v1/search/entity HTTP/1.1
Accept: application/json
{
"q": "example",
"filters": {
"source": [
"ecdfb3f2ecc8c3797e77d5795a8066ef",
"4ea8bac1bed868e1510ffd21842e9551"
]
}
}
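The relationship between the two representations can be sketched as a conversion from the Filter object to repeated `filters` query parameters. The function name `filters_to_query` is hypothetical. The `key=value` pair is percent-encoded as in the `filters=source%3D...` example above.

```python
from urllib.parse import quote, urlencode


def filters_to_query(q: str, filters: dict) -> str:
    """Serialize a query and a Filter object into a GET query string."""
    pairs = [("q", q)]
    for key, values in filters.items():
        for value in values:
            # One `filters` parameter per value, e.g. filters=source%3D<id>
            pairs.append(("filters", f"{key}={value}"))
    # quote_via=quote encodes spaces as %20 rather than +
    return urlencode(pairs, quote_via=quote)
```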
The following fields support filtering:
- entity_type
- source
- country
- risk
Supported arguments for risk filters include all boolean risk fields documented in the ontology here.
# Facets
A search response can include facet results for the filtering fields. If the facets argument is applied, the facets will be provided in the response. For example, using the following request:
GET /v1/search/record?q=example&facets=true HTTP/1.1
Accept: application/json
The response will look like:
{
"offset": 0,
"limit": 1,
"next": true,
"size": {
"count": 10000,
"qualifier": "gte"
},
"data": [
{
"id": "1",
"label": "Foo Bar",
"source_url": "/source/1",
"publication_date": "2018-05-23",
"acquisition_date": "2018-02-12",
"references_count": 3,
"record_url": "/record/1",
"matches": {
"address": ["Foo <em>example</em> Bar"]
}
}
],
"facets": {
"country": [
{
"key": "TUR",
"label": "Turkey",
"doc_count": 37701
},
{
"key": "GBR",
"label": "United Kingdom",
"doc_count": 26706
},
{
"key": "USA",
"label": "United States",
"doc_count": 10813
}
],
"entity_type": [
{
"key": "company",
"doc_count": 36720
},
{
"key": "property",
"doc_count": 22484
}
],
"source": [
{
"key": "123465646464",
"doc_count": 435631
},
{
"key": "999596322424",
"doc_count": 21078
}
],
"source_type": [
{
"key": "Company data",
"doc_count": 441718
},
{
"key": "Litigation data",
"doc_count": 28980
},
{
"key": "Other",
"doc_count": 169
}
],
"region": [
{
"key": "USA & Canada",
"doc_count": 435980
},
{
"key": "Asia Pacific",
"doc_count": 26014
},
{
"key": "Latin America & Caribbean",
"doc_count": 4099
}
]
}
}
# GeoFacets
A search response can include geo_facet results for the filtering fields. If both the facets and the geo_facets arguments are applied, they will be provided in the response. For example, using the following request:
GET /v1/search/record?q=example&facets=true&geo_facets=true HTTP/1.1
Accept: application/json
The response will look like:
{
"offset": 0,
"limit": 1,
"next": true,
"size": {
"count": 10000,
"qualifier": "gte"
},
"data": [
{
"id": "1",
"label": "Foo Bar",
"source_url": "/source/1",
"publication_date": "2018-05-23",
"acquisition_date": "2018-02-12",
"references_count": 3,
"record_url": "/record/1",
"matches": {
"address": ["Foo <em>example</em> Bar"]
}
}
],
"facets": {
"location": [
{
"key": "4/4/6",
"doc_count": 12500691
},
{
"key": "4/13/6",
"doc_count": 9194167
}
],
"country": [
{
"key": "TUR",
"label": "Turkey",
"doc_count": 37701
},
{
"key": "GBR",
"label": "United Kingdom",
"doc_count": 26706
},
{
"key": "USA",
"label": "United States",
"doc_count": 10813
}
],
"entity_type": [
{
"key": "company",
"doc_count": 36720
},
{
"key": "property",
"doc_count": 22484
}
],
"source": [
{
"key": "123465646464",
"doc_count": 435631
},
{
"key": "999596322424",
"doc_count": 21078
}
],
"source_type": [
{
"key": "Company data",
"doc_count": 441718
},
{
"key": "Litigation data",
"doc_count": 28980
},
{
"key": "Other",
"doc_count": 169
}
],
"region": [
{
"key": "USA & Canada",
"doc_count": 435980
},
{
"key": "Asia Pacific",
"doc_count": 26014
},
{
"key": "Latin America & Caribbean",
"doc_count": 4099
}
]
}
}
# Advanced Search
The search endpoint lets users specify their query in full elasticsearch query string syntax. This allows for fielded search with more advanced operators than are available with the default query syntax. For example, an advanced query for name = foo or identifier = bar can be expressed by setting the advanced parameter to true and the query to the URL-encoded equivalent of (name.value:foo OR identifier.value:bar).
GET /v1/search/entity?q=(name.value%3Afoo%20OR%20identifier.value%3Abar)&advanced=true HTTP/1.1
Accept: application/json, text/javascript
Alternatively, to search for an identifier of a specific type, use the URL-encoded equivalent of (identifier.value:123 AND identifier.type:uk_company_number). The list of available fields is as follows:
- name.value
- identifier.value, identifier.type. For a list of possible identifier types, see the ontology documentation for identifier types
- address.value
- business_purpose.value
- date_of_birth.value
- contact.value, contact.type. For a list of possible contact types, see the ontology documentation for contact types
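The advanced queries described above can be assembled and encoded as sketched here. The helper names `advanced_clause` and `advanced_search_path` are hypothetical. Values are inserted verbatim, so any characters that are special in query string syntax would need to be escaped by the caller.

```python
from urllib.parse import quote


def advanced_clause(terms: dict, operator: str = "OR") -> str:
    """Join field:value terms with a boolean operator inside parentheses."""
    joined = f" {operator} ".join(f"{field}:{value}" for field, value in terms.items())
    return f"({joined})"


def advanced_search_path(clause: str) -> str:
    """Build an entity search path with the advanced flag set."""
    return f"/v1/search/entity?q={quote(clause, safe='()')}&advanced=true"
```

For example, `advanced_search_path(advanced_clause({"name.value": "foo", "identifier.value": "bar"}))` yields the same path as the advanced request shown earlier in this section.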
# Entity Search
To search against Sayari entities, use the /v1/search/entity endpoint. This will give a paginated response of Embedded Entities along with the fields that the search matched against.
Request:
GET /v1/search/entity?q=example&limit=1 HTTP/1.1
Accept: application/json, text/javascript
Response:
{
"offset": 0,
"limit": 1,
"next": true,
"size": {
"count": 10000,
"qualifier": "gte"
},
"data": [
{
"id": "1",
"label": "example company",
"type": "company",
"entity_url": "/entity/1",
"identifiers": [
{
"type": "test_identifier",
"value": "1"
}
],
"countries": ["VEN"],
"source_count": {
"1122": {
"count": 63,
"label": "UK Companies House"
}
},
"relationship_count": {
"has_employee": 63
},
"matches": {
"name": ["<em>example company</em>"]
}
}
]
}
# Record Search
To search against Sayari records and documents, use the /v1/search/record endpoint. This will give a paginated response of Embedded Records along with the fields that the search matched against.
Request:
GET /v1/search/record?q=test&limit=1 HTTP/1.1
Accept: application/json, text/javascript
Response:
{
"offset": 0,
"limit": 1,
"next": true,
"size": {
"count": 10000,
"qualifier": "gte"
},
"data": [
{
"id": "1",
"label": "Foo Bar",
"source_url": "/source/1",
"publication_date": "2018-05-23",
"crawl_date": "2018-02-12",
"acquisition_date": 3,
"record_url": "/record/1",
"matches": {
"address": ["Foo <em>example</em> Bar"]
}
}
]
}
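The matches object in each result wraps the matched terms in <em> tags. If plain text is needed, the tags can be stripped client-side. The following is a small sketch; the helper name matched_fields is hypothetical and the sample record simply mirrors the response above.

```python
import re

# Matches the opening and closing highlight tags used in "matches" snippets.
EM_TAG = re.compile(r"</?em>")


def matched_fields(result: dict) -> dict:
    """Return the matched fields of one search result with <em> tags removed."""
    return {
        field: [EM_TAG.sub("", snippet) for snippet in snippets]
        for field, snippets in result.get("matches", {}).items()
    }


record = {
    "id": "1",
    "label": "Foo Bar",
    "matches": {"address": ["Foo <em>example</em> Bar"]},
}
print(matched_fields(record))
```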
# Resolution
The resolution endpoints allow users to search for matching entities against a provided list of attributes. The endpoint is similar to the search endpoint, except that it is tuned to return only the best match, so the client needs little or no post-processing to filter down results. Supported attributes can include one or more of the fields below, as well as the identifier type enums listed here:
- name
- identifier
- country
- address
- date_of_birth
- contact
- type
If multiple values are passed for any field, the endpoint will match entities with ANY of the values. Available type values are listed here. Available country values are any ISO 3166-1 alpha-3 country codes listed here.
# GET
When making a GET request, pass attribute values as query string parameters:
Request:
GET /v1/resolution?name=Institute of the Russian Diaspora (AKA Institute of Russian Abroad)&name=ИНСТИТУТ РУССКОГО ЗАРУБЕЖЬЯ&identifier=1057746409379&ru_inn=7727536630&country=rus HTTP/1.1
Accept: application/json
Response:
{
"fields": {
"name": [
"Institute of the Russian Diaspora (AKA Institute of Russian Abroad)",
"ИНСТИТУТ РУССКОГО ЗАРУБЕЖЬЯ"
],
"identifier": ["1057746409379"],
"ru_inn": ["7727536630"],
"country": ["rus"]
},
"data": [
{
"entity_id": "3N07LU74r3J_pO9HeGivAQ",
"label": "АВТОНОМНАЯ НЕКОММЕРЧЕСКАЯ ОРГАНИЗАЦИЯ \"ИНСТИТУТ РУССКОГО ЗАРУБЕЖЬЯ\"",
"type": "company",
"identifiers": [
{
"type": "ru_ogrn",
"value": "1057746409379",
"label": "Ru Ogrn"
},
{
"type": "ru_inn",
"value": "7727536630",
"label": "Ru Inn"
},
{
"type": "ru_kpp",
"value": "772701001",
"label": "Ru Kpp"
}
],
"addresses": [
"117218, ГОРОД МОСКВА, УЛИЦА КРЖИЖАНОВСКОГО, 13, СТР.2",
"117218, ГОРОД МОСКВА, УЛ. КРЖИЖАНОВСКОГО, Д.13, СТР.2",
"117218 ГОРОД МОСКВА УЛИЦА КРЖИЖАНОВСКОГО 13 СТР.2",
"117218, г Москва, улица Кржижановского, 13 СТР.2"
],
"countries": ["RUS"],
"matched_queries": ["identifier", "country", "name"],
"highlight": {
"name": [
"Autonomous Non-profit Organization <em>Institute</em> <em>Of</em> <em>The</em> <em>Russian</em> <em>Abroad</em>"
],
"identifier": ["<em>7727536630</em>", "<em>1057746409379</em>"],
"country": ["<em>RUS</em>"]
}
}
]
}
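The request line above is shown unencoded for readability. In practice the spaces and non-ASCII characters must be URL-encoded, and a field with several values is sent as a repeated parameter. A minimal sketch using only the Python standard library (the attributes variable name is illustrative):

```python
from urllib.parse import parse_qs, urlencode

# Attribute values taken from the example request above.
attributes = {
    "name": [
        "Institute of the Russian Diaspora (AKA Institute of Russian Abroad)",
        "ИНСТИТУТ РУССКОГО ЗАРУБЕЖЬЯ",
    ],
    "identifier": ["1057746409379"],
    "ru_inn": ["7727536630"],
    "country": ["rus"],
}

# doseq=True emits one key=value pair per list element (name=...&name=...),
# percent-encoding each value as UTF-8.
query_string = urlencode(attributes, doseq=True)
path = "/v1/resolution?" + query_string
print(path)
```

Decoding the query string with parse_qs returns the original lists, which is a quick way to check that nothing was lost in encoding.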
# POST
The POST endpoint expects a JSON object that maps each attribute to an array of values.
Request:
POST /v1/resolution HTTP/1.1
Content-Type: application/json
Accept: application/json
{
"name": [
"Institute of the Russian Diaspora (AKA Institute of Russian Abroad)",
"ИНСТИТУТ РУССКОГО ЗАРУБЕЖЬЯ"
],
"identifier": ["1057746409379"],
"ru_inn": ["7727536630"],
"country": ["rus"]
}
Response:
{
"fields": {
"name": [
"Institute of the Russian Diaspora (AKA Institute of Russian Abroad)",
"ИНСТИТУТ РУССКОГО ЗАРУБЕЖЬЯ"
],
"identifier": ["1057746409379"],
"ru_inn": ["7727536630"],
"country": ["rus"]
},
"data": [
{
"entity_id": "3N07LU74r3J_pO9HeGivAQ",
"label": "АВТОНОМНАЯ НЕКОММЕРЧЕСКАЯ ОРГАНИЗАЦИЯ \"ИНСТИТУТ РУССКОГО ЗАРУБЕЖЬЯ\"",
"type": "company",
"identifiers": [
{
"type": "ru_ogrn",
"value": "1057746409379",
"label": "Ru Ogrn"
},
{
"type": "ru_inn",
"value": "7727536630",
"label": "Ru Inn"
},
{
"type": "ru_kpp",
"value": "772701001",
"label": "Ru Kpp"
}
],
"addresses": [
"117218, ГОРОД МОСКВА, УЛИЦА КРЖИЖАНОВСКОГО, 13, СТР.2",
"117218, ГОРОД МОСКВА, УЛ. КРЖИЖАНОВСКОГО, Д.13, СТР.2",
"117218 ГОРОД МОСКВА УЛИЦА КРЖИЖАНОВСКОГО 13 СТР.2",
"117218, г Москва, улица Кржижановского, 13 СТР.2"
],
"countries": ["RUS"],
"matched_queries": ["identifier", "country", "name"],
"highlight": {
"name": [
"Autonomous Non-profit Organization <em>Institute</em> <em>Of</em> <em>The</em> <em>Russian</em> <em>Abroad</em>"
],
"identifier": ["<em>7727536630</em>", "<em>1057746409379</em>"],
"country": ["<em>RUS</em>"]
}
}
]
}
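As a rough sketch, the POST request above could be assembled with Python's standard library as follows. The base URL and bearer-token header follow the Authentication section; the helper name build_resolution_request and the placeholder token are assumptions for illustration, and the request is built but not sent.

```python
import json
import urllib.request

API_BASE = "https://api.sayari.com"  # base URL as used in the Authentication section


def build_resolution_request(attributes: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/resolution request."""
    # ensure_ascii=False keeps Cyrillic names readable; the body is sent as UTF-8.
    body = json.dumps(attributes, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        API_BASE + "/v1/resolution",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


request = build_resolution_request(
    {
        "name": ["ИНСТИТУТ РУССКОГО ЗАРУБЕЖЬЯ"],
        "identifier": ["1057746409379"],
        "country": ["rus"],
    },
    token="<ACCESS_TOKEN>",  # placeholder, replace with a real bearer token
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it; that step is omitted here so the sketch runs without network access or credentials.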
# Sources
The sources
endpoint returns metadata for all sources that Sayari collects data from.
GET /v1/source/:id
Accept: application/json
GET /v1/sources
Accept: application/json
# Arguments
- limit: integer optional
- A limit on the number of objects to be returned with a range between 1 and 100. Defaults to 100.
- offset: integer optional
- Number of results to skip before returning response. Defaults to 0.
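The argument constraints above can be checked client-side before a request is made. This is a hedged sketch: the helper name sources_path is hypothetical, and the checks simply restate the documented range for limit plus the assumption that offset cannot be negative.

```python
from urllib.parse import urlencode


def sources_path(limit: int = 100, offset: int = 0) -> str:
    """Build the /v1/sources path, enforcing the documented limit range."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if offset < 0:
        # Assumption: a negative offset is never meaningful.
        raise ValueError("offset must not be negative")
    return "/v1/sources?" + urlencode({"limit": limit, "offset": offset})


print(sources_path(limit=2))
```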
# Content Types
Supported content types include:
- application/json
- JSON encoded response.
# Source by ID
GET /v1/source/b9dc2ca839c318d04910a8a680131fdf HTTP/1.1
Accept: application/json, text/javascript
{
"id": "b9dc2ca839c318d04910a8a680131fdf",
"label": "Albania Trade Register Extracts",
"country": "ALB"
}
# List all Sources
GET /v1/sources?limit=2 HTTP/1.1
Accept: application/json, text/javascript
{
"offset": 0,
"limit": 2,
"size": {
"count": 124,
"qualifier": "eq"
},
"next": true,
"data": [
{
"id": "b9dc2ca839c318d04910a8a680131fdf",
"label": "Albania Trade Register Extracts",
"country": "ALB"
},
{
"id": "4ea8bac1bed868e1510ffd21842e9551",
"label": "Albania Trade Register Bulletins",
"country": "ALB"
}
]
}
# Info Endpoints
# Usage
The usage endpoint provides a simple interface for retrieving usage information for your API account. This includes both views per API path and credits consumed. The response also specifies the time period covered by the usage query and whether or not it includes total usage.
GET /usage
Accept: application/json
# Arguments
- from: date optional
- An ISO 8601 encoded date string, in the format YYYY-MM-DD, indicating the start of the period for which to obtain usage stats.
- to: date optional
- An ISO 8601 encoded date string, in the format YYYY-MM-DD, indicating the end of the period for which to obtain usage stats.
# Content Types
Supported content types include:
- application/json
- JSON encoded response.
# Examples
GET /usage
{
"usage": {
"get_entity": 1,
"search": 1,
"search_entities": 1,
"search_records": 1
},
"from": "2020-05-22T22:22:22.222Z",
"to": "2020-06-02T12:22:22.222Z"
}
Note: If usage stats were not aggregated on the specified date, the next closest date will be provided.
GET /usage?from=2020-05-21&to=2021-05-28
{
"usage": {
"get_entity": 5,
"search": 2,
"search_entities": 4,
"search_records": 5
},
"from": "2020-05-21T00:00:00Z",
"to": "2021-05-28T00:00:00Z"
}
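The from and to arguments use the YYYY-MM-DD form, which is what Python's date.isoformat() produces. A minimal sketch for building the request path (the helper name usage_path is hypothetical):

```python
from datetime import date
from typing import Optional
from urllib.parse import urlencode


def usage_path(start: Optional[date] = None, end: Optional[date] = None) -> str:
    """Build the /usage path; both date bounds are optional."""
    params = {}
    if start is not None:
        params["from"] = start.isoformat()  # YYYY-MM-DD
    if end is not None:
        params["to"] = end.isoformat()
    return "/usage" + ("?" + urlencode(params) if params else "")


print(usage_path())
print(usage_path(date(2020, 5, 21), date(2021, 5, 28)))
```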
# History
The history endpoint returns a user's event history.
GET /history
Accept: application/json
# Arguments
- events: string[] optional
- The type of events to filter on.
- from: date optional
- An ISO 8601 encoded date string, in the format YYYY-MM-DD, indicating the start of the period for the events.
- to: date optional
- An ISO 8601 encoded date string, in the format YYYY-MM-DD, indicating the end of the period for the events.
- size: integer optional
- The maximum number of events to return.
- token: string optional
- Pagination token to retrieve the next page of results.
# Content Types
Supported content types include:
- application/json
- JSON encoded response.
# Examples
GET /history?from=2020-02-22&to=2021-05-29
{
"events": [
{
"user": "auth0|abcd",
"environment": "production",
"event": "add_to_graph",
"data": {
"countries": [
"UKR"
],
"email": "[email protected]",
"groups": [
"Group-A"
],
"ip": 0,
"level": "info",
"message": "add_to_graph",
"size": 51,
"sources": [
"defddbef5048d6"
]
},
"timestamp": "2021-05-20T00:13:14.307Z"
},
...
]
}
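Once a history response has been decoded, the events array can be summarized client-side. The sketch below counts events by type and finds the most recent timestamp. The sample data is made up for illustration; in particular, the second event type ("search") is an assumed value, not one confirmed by this reference.

```python
from collections import Counter

# Trimmed, made-up sample shaped like the response above.
response = {
    "events": [
        {"event": "add_to_graph", "timestamp": "2021-05-20T00:13:14.307Z"},
        {"event": "add_to_graph", "timestamp": "2021-05-21T08:30:00.000Z"},
        {"event": "search", "timestamp": "2021-05-19T17:45:10.120Z"},  # hypothetical type
    ]
}

# Count how many times each event type occurs.
counts = Counter(item["event"] for item in response["events"])

# Timestamps in this fixed UTC format sort chronologically as plain strings.
latest = max(item["timestamp"] for item in response["events"])

print(counts.most_common())
print(latest)
```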