# API Reference

Sayari exposes all graph-related resources through a REST API. Requests use standard HTTP verbs and carry their parameters either in the query string or in a JSON-encoded request body; the response format is selected through content negotiation.

# Authentication

Authentication to the API is performed via JWT access tokens. To make API calls, you'll need to set the following variables in order to obtain a token.

  • CLIENT_ID
  • CLIENT_SECRET

The bearer token will then be granted by requesting the token resource.

curl --request POST \
  --url https://api.sayari.com/oauth/token \
  --header 'content-type: application/json' \
  --data '{
    "client_id": "'"$CLIENT_ID"'",
    "client_secret": "'"$CLIENT_SECRET"'",
    "audience":"sayari.com",
    "grant_type":"client_credentials"
  }'
{
  "access_token": "sk_test_4eC39HqLyjWDarjtT1zdp7dc",
  "expires_in": 86400,
  "token_type": "Bearer"
}

This token will expire in 24 hours, at which point a new token should be requested.

To use the token to authenticate HTTP requests against the Sayari API, pass the bearer token in the request's Authorization header. For example, using the token retrieved above:

curl 'https://api.sayari.com/search/entity?q=china' -H"Authorization: Bearer sk_test_4eC39HqLyjWDarjtT1zdp7dc"
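The token flow above can also be sketched in Python. This is a minimal illustration using only the standard library, not an official client: `build_token_request` and `TokenCache` are hypothetical names, and the 60-second refresh margin is an assumption rather than documented behaviour. The URL and body fields are taken from the curl example.

```python
import json
import time
import urllib.request

TOKEN_URL = "https://api.sayari.com/oauth/token"  # from the curl example


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that asks for a bearer token."""
    body = json.dumps({
        "client_id": client_id,
        "client_secret": client_secret,
        "audience": "sayari.com",
        "grant_type": "client_credentials",
    }).encode("utf-8")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        headers={"content-type": "application/json"},
        method="POST",
    )


class TokenCache:
    """Hypothetical helper: remembers a token and when it stops being valid."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def store(self, token_response: dict) -> None:
        # expires_in is a lifetime in seconds (86400 = 24 hours in the example).
        self._token = token_response["access_token"]
        self._expires_at = self._clock() + token_response["expires_in"]

    def needs_refresh(self, margin_seconds: float = 60.0) -> bool:
        # Assumption: refresh slightly early so no request carries a stale token.
        return self._token is None or self._clock() >= self._expires_at - margin_seconds

    def auth_header(self) -> dict:
        return {"Authorization": f"Bearer {self._token}"}
```

Sending the request is then a matter of passing the result of `build_token_request(...)` to `urllib.request.urlopen` and feeding the decoded JSON body to `TokenCache.store`.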

# Requests

Sayari utilizes standard HTTP verbs to indicate request intent. All resources can be requested using GET requests with request parameters serialized in the url query string, e.g. /entity/:id?referenced_by.offset=50&attributes.address.limit=10.
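As a rough sketch of how nested request parameters map onto the dotted query-string keys shown above, the snippet below flattens a nested dictionary with the standard library. `flatten_params` and `build_url` are hypothetical helpers written for illustration, and `abc` is a made-up entity id.

```python
from urllib.parse import urlencode


def flatten_params(params: dict, prefix: str = "") -> list:
    """Turn nested dicts into dotted key/value pairs; lists become repeated keys."""
    pairs = []
    for key, value in params.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            pairs.extend(flatten_params(value, name))
        elif isinstance(value, (list, tuple)):
            pairs.extend((name, item) for item in value)
        else:
            pairs.append((name, value))
    return pairs


def build_url(path: str, params: dict) -> str:
    """Append the URL-encoded, flattened parameters to a request path."""
    query = urlencode(flatten_params(params))
    return f"{path}?{query}" if query else path


print(build_url("/entity/abc", {
    "referenced_by": {"offset": 50},
    "attributes": {"address": {"limit": 10}},
}))
# /entity/abc?referenced_by.offset=50&attributes.address.limit=10
```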

# Pagination

Response fields that represent unbounded collections, such as search results, an entity's attributes or relationships, or a record's references, are paginated when the collection is larger than can be efficiently returned in a single request. Paginated requests take one of two forms: token pagination for the entity endpoint, and offset pagination for all other endpoints.

# Token Pagination

Token paginated requests specify optional next or prev parameters to retrieve the next or previous page of results. Omitting both token parameters will return the first page of results, which may contain a next token in the response body if there are more results to return. Result pages beyond the first will include a prev token in the response body, allowing for backwards pagination. The limit parameter will restrict the size of the page, up to whatever default is set for that result type.

# Arguments

next: string optional
Token to retrieve the next page of results

prev: string optional
Token to retrieve the previous page of results

limit: integer optional
A limit on the number of objects to be returned. Defaults to 100.

# Offset Pagination

Offset paginated requests specify optional offset and limit parameters for the paginatable list. The response data will include a next boolean field indicating whether more results remain after the current page, and may also include a size key with a count indicating the total length of the collection. Collections whose size is not feasible to compute, such as traversals, will not have a size key.

# Arguments

limit: integer optional
A limit on the number of objects to be returned. Defaults to 100.

offset: integer optional
Number of results to skip before returning response. Defaults to 0.

The search endpoint is paginated via offsets in cases where the search result size is greater than the page limit size. For very large result sets, the search count may be estimated.

GET /search/entity?q=china&offset=10&limit=5 HTTP/1.1
Accept: application/json
{
    "offset": 10,
    "limit": 5,
    "next": true,
    "size": {
        "count": 10000,
        "qualifier": "gte"
    },
    "data": [...]
}
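Walking an offset-paginated collection amounts to advancing offset by limit until next is false. The sketch below shows that loop under stated assumptions: `fetch_page(offset, limit)` is a hypothetical callable that would perform the HTTP request and return the decoded JSON body, and `fake_fetch` stands in for it so the example runs offline.

```python
from typing import Callable, Iterator


def iter_offset_pages(fetch_page: Callable[[int, int], dict], limit: int = 100) -> Iterator:
    """Yield every item of an offset-paginated collection, page by page."""
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        yield from page["data"]
        if not page.get("next"):
            break  # "next": false means no further pages remain
        offset += limit


def fake_fetch(offset: int, limit: int) -> dict:
    """Stand-in for the real endpoint: paginates a list of 12 integers."""
    items = list(range(12))
    return {
        "offset": offset,
        "limit": limit,
        "next": offset + limit < len(items),
        "data": items[offset:offset + limit],
    }


print(list(iter_offset_pages(fake_fetch, limit=5)))
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
```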

# Paginating Entities

An entity's attributes, relationships, possibly-same-as entities, and record references can all be paginated via tokens.

Pagination next or prev tokens are passed as query parameters with any combination of the following:

  • attributes.[field].['next' | 'prev']=[string], e.g. ?attributes.address.next=qr7bvn2
  • relationships.['next' | 'prev']=[string], e.g. ?relationships.next=qr7bvn2
  • possibly_same_as.['next' | 'prev']=[string], e.g. ?possibly_same_as.next=qr7bvn2
  • referenced_by.['next' | 'prev']=[string], e.g. ?referenced_by.next=qr7bvn2

For example, this HTTP request could return the following result:

GET /entity/abc?attributes.name.limit=10&relationships.next=qr7bvn2&relationships.limit=150 HTTP/1.1
Accept: application/json
{
    ...
    "attributes": {
        "name": {
            "next": "y4rkp09",
            "limit": 10,
            "size": {
                "count": 18,
                "qualifier": "eq"
            },
            "data": [...]
        }
    },
    "relationships": {
        "next": "w98vmfd",
        "prev": "myvc64l",
        "limit": 150,
        "size": {
            "count": 1201,
            "qualifier": "eq"
        },
        "data": [...]
    },
    "possibly_same_as": {
        "limit": 100,
        "size": {
            "count": 12,
            "qualifier": "eq"
        },

        "data": [...]
    },
    "referenced_by": {
        "next": "84ct7eb",
        "limit": 100,
        "size": {
            "count": 300,
            "qualifier": "eq"
        },
        "data": [...]
    }
}
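Token pagination can be followed by passing each page's next token back as relationships.next until no token is returned. The sketch below illustrates this for relationships only. `fetch_entity(entity_id, params)` is a hypothetical callable that would issue the GET request and return the decoded body; the tokens and data in `_PAGES` are made up so the example runs offline.

```python
from typing import Callable, Iterator


def iter_relationships(fetch_entity: Callable[[str, dict], dict],
                       entity_id: str, limit: int = 100) -> Iterator:
    """Yield all relationships of an entity by following `next` tokens."""
    params = {"relationships.limit": limit}
    while True:
        page = fetch_entity(entity_id, params)["relationships"]
        yield from page["data"]
        token = page.get("next")
        if not token:
            break  # the last page carries no next token
        params = {"relationships.limit": limit, "relationships.next": token}


# Stand-in for the real endpoint: two pages linked by a made-up token.
_PAGES = {
    None: {"next": "tok2", "limit": 2, "data": ["r1", "r2"]},
    "tok2": {"prev": "tok1", "limit": 2, "data": ["r3"]},
}


def fake_fetch_entity(entity_id: str, params: dict) -> dict:
    return {"id": entity_id, "relationships": _PAGES[params.get("relationships.next")]}


print(list(iter_relationships(fake_fetch_entity, "abc", limit=2)))
# ['r1', 'r2', 'r3']
```

The same loop should apply to attributes.[field], possibly_same_as, and referenced_by by swapping the parameter prefix.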

# Paginating Records

A record's entity references can be paginated via offsets in cases where the record cites more entities than can be returned in a single request.

For example, this HTTP request could yield something like the following result:

GET /record/123?references.offset=100&references.limit=100 HTTP/1.1
Accept: application/json
{
    ...
    "references": {
        "next": false,
        "offset": 100,
        "limit": 100,
        "size": {
            "count": 140,
            "qualifier": "eq"
        },
        "data": [...]
    }
}

# Paginating Traversals

The traversal endpoints' response paths are paginated via offsets. Because the total number of potential matching paths is very expensive to compute, the response does not include size values. Instead, use the next boolean field to determine if there are more pages of results to return.

GET /traversal/123?offset=20&max_depth=8 HTTP/1.1
Accept: application/json
{
    "offset": 20,
    "limit": 20,
    "next": true,
    "data": [...]
}

# Responses

# Success

All successful requests will be indicated with 2xx status codes.

| Code | Response | Description |
| --- | --- | --- |
| 200 | OK | Successful GET request. |
| 201 | Created | Successful POST request. |

# Errors

All errors will be returned with the corresponding HTTP response status indicating the reason for a failed request.

| Code | Response | Description |
| --- | --- | --- |
| 400 | Bad Request | Incorrectly formatted request. |
| 401 | Unauthorized | Request made without valid token. |
| 404 | Not Found | Resource not found or does not exist. |
| 405 | Method Not Allowed | Request made with an unsupported HTTP method. Currently only GET and POST supported. |
| 406 | Not Acceptable | Request made in an unacceptable state. This is most commonly due to parameter validation errors. |
| 415 | Unsupported Media Type | Accept header on request set to an unsupported media type. Currently only application/json and text/csv supported for indicated resources. |
| 429 | Rate Limited | Too many requests within too short of a period. The reply will contain a retry-after header that indicates when the client can safely retry. |
| 500 | Internal Server Error | Internal server error occurred. |

The error will also be indicated as a JSON object in the body in the following format:

{
  "status": 500,
  "success": false,
  "messages": ["Internal Server Error"]
}

Validation messages on request parameters will also be displayed in the messages field to give more information on the failed request.
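A client can fold the status code, the messages array, and the retry-after header into one structure before deciding what to do. The following is a sketch only: `summarize_error` is a hypothetical helper, and it assumes the retry-after value is a number of seconds, which the text above does not specify.

```python
import json
from typing import Optional


def summarize_error(status: int, body: str, headers: dict) -> dict:
    """Collect the useful parts of a failed response into one dict."""
    try:
        messages = list(json.loads(body).get("messages", []))
    except (ValueError, AttributeError, TypeError):
        messages = []  # body was empty, not JSON, or not the documented shape

    retry_after: Optional[float] = None
    if status == 429:
        # Header names are case-insensitive, so normalise before looking up.
        lowered = {name.lower(): value for name, value in headers.items()}
        try:
            # Assumption: the header holds a delay in seconds.
            retry_after = float(lowered["retry-after"])
        except (KeyError, ValueError):
            retry_after = None

    return {"status": status, "messages": messages, "retry_after": retry_after}


print(summarize_error(
    500,
    '{"status": 500, "success": false, "messages": ["Internal Server Error"]}',
    {},
))
# {'status': 500, 'messages': ['Internal Server Error'], 'retry_after': None}
```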

# Types

# Dates

Date strings are formatted as YYYY[-MM[-DD]], meaning 2000-10-02, 2000-10, and 2000 are all valid date formats. Dates without day or month-day segments appear where the day or month is either not known or not relevant.

Entity attributes and entity relationships may have a from_date and/or to_date field to indicate when an attribute or relationship value started or ended, as well as an optional generic date field. Records also have date fields to indicate when the record was published (publication_date) and when it was acquired by Sayari (acquisition_date).
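Because the YYYY[-MM[-DD]] format allows the month and day to be absent, standard date parsers that expect a full date will reject some values. One possible way to handle this is sketched below; `PartialDate` and `parse_partial_date` are hypothetical names, and the sketch checks only the shape of the string, not whether the date exists on the calendar.

```python
import re
from typing import NamedTuple, Optional

_DATE_RE = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")


class PartialDate(NamedTuple):
    year: int
    month: Optional[int] = None
    day: Optional[int] = None


def parse_partial_date(text: str) -> PartialDate:
    """Parse YYYY, YYYY-MM, or YYYY-MM-DD; missing segments become None."""
    match = _DATE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a YYYY[-MM[-DD]] date: {text!r}")
    year, month, day = (int(g) if g is not None else None for g in match.groups())
    return PartialDate(year, month, day)


print(parse_partial_date("2000-10-02"))  # PartialDate(year=2000, month=10, day=2)
print(parse_partial_date("2000-10"))     # PartialDate(year=2000, month=10, day=None)
print(parse_partial_date("2000"))        # PartialDate(year=2000, month=None, day=None)
```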

# Countries

Country IDs use ISO 3166 trigram (three-letter) country codes.

# Source

{
  "id": "b9dc2ca839c318d04910a8a680131fdf",
  "label": "Albania Trade Register Extracts",
  "country": "ALB"
}

# EmbeddedEntity

{
  "id": "123",
  "label": "ACME Co.",
  "type": "company",
  "entity_url": "/entity/123",
  "identifiers": [{ "type": "uk_company_number", "value": "12345" }],
  "countries": ["GBR"],
  "closed": false,
  "pep": false,
  "sanctioned": false,
  "psa_sanctioned": "123456",
  "psa_count": 2,
  "source_count": {
    "some_source_id": {
      "count": 2,
      "label": "Some Source Label"
    }
  },
  "degree": 304,
  "addresses": ["32535 31st Rd, Arkansas City, KS, 67005"],
  "date_of_birth": "1990-08-03",
  "relationship_count": {
    "has_shareholder": 300,
    "shareholder_of": 4
  }
}

# EmbeddedRecord

{
  "id": "abc",
  "label": "Some Record - 1/14/2020",
  "source": "some_source_id",
  "publication_date": "2019-02-04",
  "acquisition_date": "2019-02-05",
  "references_count": 2,
  "record_url": "record/abc",
  "source_url": "https://entity.com/company/12345"
}

# PathSegment

{
  "field": "shareholder_of",
  "relationships": {
    "shareholder_of": {
      "values": [
        {
          "record": "ecdfb3f2ecc8c3797e77d5795a8066ef/123567",
          "attributes": {
            "shares": [
              {
                "percentage": 100,
                "monetary_value": 2100000,
                "currency": "USD"
              }
            ]
          },
          "date": "2018-06-14"
        }
      ]
    },
    "director_of": {
      "values": [
        {
          "record": "ecdfb3f2ecc8c3797e77d5795a8066ef/123567",
          "attributes": {
            "position": [{ "value": "Director" }]
          },
          "from_date": "2007-12-01",
          "to_date": "2015-05-01",
          "acquisition_date": "2021-04-14",
          "publication_date": "2021-04-14"
        }
      ],
      "former": true
    }
  },
  "entity": EmbeddedEntity
}

# PossiblySameAsMatches

{
  "name": [
    {
      "target": "John Smith",
      "source": "John C Smith"
    }
  ],
  "date_of_birth": [
    {
      "target": "1970-05-02",
      "source": "1970-05"
    }
  ]
}

# Resource Endpoints

# Entity

GET /v1/entity/:id
Accept: application/json

An entity represents a single real-world thing, such as a person, company, or land property, that has been extracted from one or more records. The entity response includes information on that entity's attributes and relationships, as well as the records that entity is sourced to. Entity requests include paginated lists of attributes and relationships, with limit defaulting to 100. Paginate the lists of attributes and relationships using the next or prev token included in the response.

# Arguments

attributes.[field].next: string optional
The pagination token for the next page of attribute `[field]`, e.g. name, address, or country.

attributes.[field].prev: string optional
The pagination token for the previous page of attribute `[field]`, e.g. name, address, or country.

attributes.[field].limit: integer optional
Limit total values returned for attribute `[field]`. Defaults to 100.

relationships.next: string optional
The pagination token for the next page of relationship results

relationships.prev: string optional
The pagination token for the previous page of relationship results

relationships.limit: integer optional
Limit total relationship values. Defaults to 100.

relationships.type: string optional
Filter relationships to relationship type, e.g. director_of or has_shareholder

relationships.sort: string optional
Sorts relationships by As Of date or Shareholder percentage, e.g. date or -shares

relationships.startDate: date optional
Filters relationships to after a date

relationships.endDate: date optional
Filters relationships to before a date

relationships.minShares: integer optional
Filters relationships to greater than or equal to a Shareholder percentage

relationships.country: string[] optional
Filters relationships to a list of countries

relationships.arrivalCountry: string[] optional
Filters shipment relationships to a list of arrival countries

relationships.departureCountry: string[] optional
Filters shipment relationships to a list of departure countries

relationships.hsCode: string optional
Filters shipment relationships to an HS code

possibly_same_as.next: string optional
The pagination token for the next page of possibly same entities.

possibly_same_as.prev: string optional
The pagination token for the previous page of possibly same entities.

possibly_same_as.limit: integer optional
Limit total possibly same as entities. Defaults to 100.

referenced_by.next: string optional
The pagination token for the next page of the entity's referencing records

referenced_by.prev: string optional
The pagination token for the previous page of the entity's referencing records

referenced_by.limit: integer optional
Limit total values returned for entity's referencing records. Defaults to 100.

# Content Types

Supported content types include:

application/json
JSON response.
text/csv
CSV response.
application/pdf
PDF response.
application/vnd.ms-excel
Microsoft Excel XLSX response.

# Examples

GET /v1/entity/123?attributes.name.limit=10&relationships.next=wkdjtrsdre HTTP/1.1
Accept: application/json
{
    "id": "123",
    "label": "ACME Co.",
    "type": "company",
    "entity_url": "/v1/entity/123",
    "identifiers": [{ "type": "uk_company_number", "value": "12345" }],
    "countries": ["GBR"],
    "source_count": {
        "some_source_id": {
            "count": 2,
            "label": "UK Companies House"
        }
    },
    "relationship_count": {
        "has_shareholder": 300
    },
    "attributes": {
        "name": {
            "next": "b9dc2ca839c31",
            "limit": 100,
            "size": {
                "count": 4,
                "qualifier": "eq"
            },
            "data": [
                {
                    "properties": {
                        "value": "Acme Co."
                    },
                    "record": ["record_id"],
                    "record_count": 3
                },
                ...
            ]
        },
        ...
    },
    "relationships": {
        "next": "b9dc2ca839c31",
        "prev": "49dc2ca839c31",
        "limit": 100,
        "size": {
            "count": 300,
            "qualifier": "eq"
        },
        "data": [
            {
                "target": EmbeddedEntity,
                "types": {
                    "shareholder_of": [
                        {
                            "record": "record-id",
                            "attributes": {
                                "shares": [
                                    { "percentage": 100, "monetary_value": 2100000, "currency": "USD" }
                                ]
                            },
                            "date": "2018-06-14"
                        }
                    ]
                }
            },
            ...
        ]
    },
    "possibly_same_as": {
        "next": "b9dc2ca839c31",
        "limit": 100,
        "size": {
            "count": 2,
            "qualifier": "eq"
        },
        "data": [
            {
                "entity": EmbeddedEntity,
                "matches": PossiblySameAsMatches
            },
            ...
        ]
    },
    "referenced_by": {
        "next": "b9dc2ca839c31",
        "limit": 100,
        "size": {
            "count": 3,
            "qualifier": "eq"
        },
        "data": [
            {
                "record": EmbeddedRecord,
                "type": "about"
            },
            ...
        ]
    }
}

# Entity Summary

GET /v1/entity_summary/:id
Accept: application/json

The Entity Summary endpoint returns a smaller entity payload, including:

  • up to 50 values for each of the following attributes: name, address, identifier, weak_identifier, status, company_type, contact, business_purpose, country
  • up to 50 entities that are possibly the same as the target entity
  • up to 100 records the entity is sourced to

# Content Types

Supported content types include:

application/json
JSON encoded response.

# Examples

GET /v1/entity_summary/123 HTTP/1.1
Accept: application/json
{
    "id": "123",
    "label": "ACME Corp.",
    "degree": 3,
    "risk": {
      "basel_aml": {
        "value": 4.63,
        "metadata": { "country": ["USA"] }
      },
      "cpi_score": {
        "value": 67,
        "metadata": { "country": ["USA"] }
      },
      "sanctioned_distance": {
        "value": 3,
        "metadata": { }
      }
    },
    "psa_count": 1,
    "type": "company",
    "entity_url": "/v1/entity/123",
    "identifiers": [{ "type": "uk_company_number", "value": "1234" }],
    "countries": ["GBR"],
    "source_count": {
        "ecdfb3f2ecc8c3797e77d5795a8066ef": {
            "count": 6,
            "label": "UK Corporate Registry"
        },
        "2a4fe9a14e332c8f9ded1f8a457c2b89": {
            "count": 12,
            "label": "UK Land Commercial and Corporate Ownership Data (CCOD)"
        }
    },
    "relationship_count": {
        "has_registered_agent": 1,
        "has_shareholder": 2,
        "linked_to": 2
    },
    "attributes": {
        "identifier": {
            "offset": 0,
            "limit": 20,
            "next": false,
            "size": {
                "count": 1,
                "qualifier": "eq"
            },
            "data": [
                {
                    "properties": {
                        "value": "1234",
                        "type": "uk_company_number"
                    },
                    "record": ["abc"],
                    "record_count": 21
                }
            ]
        },
        ...
    }
}

# Traversal

GET /v1/traversal/:id
Accept: application/json

The Traversal endpoint returns paths from a single target entity to up to 50 directly or indirectly-related entities. Each path includes information on the 0 to 10 intermediary entities, as well as their connecting relationships. The response's explored_count field indicates the size of the graph subset the application searched. Running a traversal on a highly connected entity with a restrictive set of argument filters and a high max depth will require the application to explore a higher number of traversal paths, which may affect performance.

# Arguments

offset: integer optional
Offset values for traversal. Defaults to 0.

limit: integer optional
Limit total values for traversal. Defaults to 20.

min_depth: integer optional
Set minimum depth for traversal. Defaults to 1.

max_depth: integer optional
Set maximum depth for traversal. Defaults to 6.

relationships: string, string[] optional
Set relationship type(s) to follow when traversing related entities. Defaults to following all relationship types.

psa: boolean optional
Also traverse relationships from entities that are possibly the same as any entity that appears in the path. Defaults to not traversing possibly same as relationships.

countries: string, string[] optional
Filter paths to only those that end at an entity associated with the specified country(ies). Defaults to returning paths that end in any country.

types: string, string[] optional
Filter paths to only those that end at an entity of the specified type(s). Defaults to returning paths that end at any type.

sanctioned: boolean optional
Filter paths to only those that end at an entity appearing on a watchlist. Defaults to not filtering paths by sanctioned status.

pep: boolean optional
Filter paths to only those that end at an entity appearing on a PEP list. Defaults to not filtering paths by PEP status.

min_shares: integer optional
Set minimum percentage of share ownership for traversal. Defaults to 0.

include_unknown_shares: boolean optional
Also traverse relationships when share percentages are unknown. Only useful when min_shares is set greater than 0. Defaults to true.

exclude_former_relationships: boolean optional
Exclude relationships that were valid in the past but not at the present time. Defaults to false.

exclude_closed_entities: boolean optional
Exclude entities that existed in the past but not at the present time. Defaults to false.

The additional risk filters below will filter paths to only those that end with an entity that we have flagged with the corresponding risk factor. Details about what these risk factors indicate can be found in the ontology documentation.

eu_high_risk_third: boolean optional

reputational_risk_modern_slavery: boolean optional

state_owned: boolean optional

formerly_sanctioned: boolean optional

reputational_risk_terrorism: boolean optional

reputational_risk_organized_crime: boolean optional

reputational_risk_financial_crime: boolean optional

reputational_risk_bribery_and_corruption: boolean optional

reputational_risk_other: boolean optional

reputational_risk_cybercrime: boolean optional

regulatory_action: boolean optional

law_enforcement_action: boolean optional

xinjiang_geospatial: boolean optional

# Examples

GET /v1/traversal/123?max_depth=6&relationships=has_shareholder&relationships=branch_of&psa HTTP/1.1
Accept: application/json
{
    "min_depth": 1,
    "max_depth": 6,
    "relationships": ["has_shareholder", "branch_of"],
    "countries": [],
    "types": [],
    "psa": true,
    "offset": 0,
    "limit": 20,
    "next": true,
    "data": [
        {
            "target": EmbeddedEntity,
            "path": PathSegment[]
        },
        ...
    ],
    "explored_count": 1201
}
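Array-valued arguments such as relationships are sent as repeated query parameters, as in the request above. A minimal sketch of building such a URL follows. `traversal_url` is a hypothetical helper, and serializing booleans as `true`/`false` (rather than the bare `&psa` flag used in the example) is an assumption.

```python
from urllib.parse import urlencode


def traversal_url(entity_id: str, **filters) -> str:
    """Build a traversal request path; list values become repeated parameters."""
    params = {}
    for name, value in filters.items():
        if isinstance(value, bool):
            # Assumption: booleans are accepted in the form name=true / name=false.
            params[name] = "true" if value else "false"
        else:
            params[name] = value
    query = urlencode(params, doseq=True)  # doseq expands lists into repeated keys
    return f"/v1/traversal/{entity_id}" + (f"?{query}" if query else "")


print(traversal_url(
    "123",
    max_depth=6,
    relationships=["has_shareholder", "branch_of"],
    psa=True,
))
# /v1/traversal/123?max_depth=6&relationships=has_shareholder&relationships=branch_of&psa=true
```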

# UBO

GET /v1/ubo/:id
Accept: application/json

The UBO endpoint returns paths from a single target entity to up to 50 beneficial owners. The endpoint is a shorthand for the equivalent traversal query:

GET /v1/traversal/:id?relationships=has_shareholder&relationships=has_beneficial_owner&relationships=has_owner&relationships=subsidiary_of&relationships=branch_of

# Arguments

See Traversal

# Examples

See Traversal


# Ownership

GET /v1/downstream/:id
Accept: application/json

The Ownership endpoint returns paths from a single target entity to up to 50 entities directly or indirectly owned by that entity. The endpoint is a shorthand for the equivalent traversal query:

GET /v1/traversal/:id?relationships=shareholder_of&relationships=beneficial_owner_of&relationships=owner_of&relationships=has_subsidiary&relationships=has_branch

# Arguments

See Traversal

# Examples

See Traversal


# Watchlist

GET /v1/watchlist/:id
Accept: application/json

The Watchlist endpoint returns paths from a single target entity to up to 50 other entities that appear on a watchlist or are PEPs. The endpoint is a shorthand for the equivalent traversal query:

GET /v1/traversal/:id?watchlist

# Arguments

See Traversal

# Examples

See Traversal


# Shortest Path

GET /v1/shortest_path
Accept: application/json

The Shortest Path endpoint returns a response identifying the shortest traversal path connecting each pair of entities.

# Arguments

entities: string[]
Entity ids

# Examples

GET /v1/shortest_path?entities=123&entities=345 HTTP/1.1
Accept: application/json
{
    "entities": [
        "123",
        "345"
    ],
    "data": [
        {
            "target": EmbeddedEntity,
            "path": PathSegment[]
        }
    ]
}

# Record

GET /v1/record/:id
Accept: application/json

Records represent the documents used to source entities. The record resource includes metadata about the document itself as well as the entities extracted from the document.

# Arguments

references.limit: integer optional
A limit on the number of references to be returned. Defaults to 100.

references.offset: integer optional
Number of references to skip before returning response. Defaults to 0.

# Content Types

Supported content types include:

application/json
JSON encoded response.

# Examples

GET /v1/record/:id HTTP/1.1
Accept: application/json
{
    "id": "1",
    "label": "foo",
    "source_url": "/source/1",
    "publication_date": "2018-02-28",
    "acquisition_date": "2018-06-13",
    "references_count": 3,
    "record_url": "/record/1",
    "document_urls": [
        "/document/1/file/company-html.html"
    ],
    "references": {
        "next": false,
        "offset": 0,
        "limit": 100,
        "size": {
            "count": 1,
            "qualifier": "eq"
        },
        "data": EmbeddedEntity[]
    }
}

# Fields

id: string
Internal ID of the associated record.

label: string
Human readable label for the record.

source_url: string
Url to the source associated with the record.

publication_date: date
Date of record publication.

acquisition_date: date
The date Sayari acquired the source document.

references_count: integer
Count of entities referenced in source document.

record_url: string
Url to current record.

document_urls: string array optional
A list of document urls for downloading the underlying source document. If omitted then source document is not able to be downloaded.

references: paginated search entity
A list of embedded entity references with meta indicating the type of reference.

# Reference Types

An about reference gives a list of entities that the record is specifically about. In most cases there will be a single entity of this type.

The mentions reference is the other reference type and corresponds to another entity mentioned in the record.


# Search

GET /v1/search/[entity|record]
Accept: application/json

The search endpoint searches Sayari's internal data using text queries. Entities and records each have a search endpoint, and both accept the same arguments. Note: searches are limited to a maximum of 10,000 results.

# Arguments

q: string
Query term. The syntax for the query parameter follows Elasticsearch simple query string syntax. This includes the ability to use search operators and to perform nested queries. Must be URL encoded.

filters: Filter array optional
Filters applied to the search query to limit the result set. Set as an object {"filter_key": "value"}, where filter_key can be one of source, country, state, city, entity_type, bounds, or risk. Serialized to the query string if not sent in a POST request.

The country filter accepts ISO 3166 codes and state accepts abbreviations.

The bounds filter is a pipe-delimited (|) string of four numeric values representing a bounding box: the north latitude, west longitude, south latitude, and east longitude bounds. For example: 46.12|-76|45|-75

fields: string array optional
Record or entity fields to search against.

facets: boolean optional
Whether or not to return search facets in results giving counts by field. Defaults to false.

geo_facets: boolean optional
Whether or not to return search geo bound facets in results giving counts by geo tile. Defaults to false.

advanced: boolean optional
Set to true to enable full Elasticsearch query string syntax, which allows for fielded search and more complex operators. Note that the syntax is stricter and can result in empty result sets. Defaults to false.

limit: integer optional
A limit on the number of objects to be returned with a range between 1 and 100. Defaults to 100.

offset: integer optional
Number of results to skip before returning response. Defaults to 0.

# Content Types

Supported content types include:

application/json
JSON encoded response.

text/csv
CSV string representation of response.

# Examples

To make a simple request use the q parameter to search against a query. The following are equivalent:

GET /v1/search/record?q=test HTTP/1.1
Accept: application/json

POST /v1/search/record HTTP/1.1
Accept: application/json

{
  "q": "test"
}

WARNING

All search arguments passed through the query string of the GET request must be url encoded. For example foo OR (bar~5) becomes foo%20OR%20(bar~5).
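In Python, the encoding in the warning above can be reproduced with `urllib.parse.quote`. This is a small sketch: `encode_query_term` is a hypothetical helper, and leaving parentheses unescaped is a choice made only to match the example string. Encoding them as %28 and %29 is equally valid URL encoding.

```python
from urllib.parse import quote


def encode_query_term(term: str) -> str:
    """Percent-encode a search term, keeping parentheses literal as in the example."""
    return quote(term, safe="()")


print(encode_query_term("foo OR (bar~5)"))
# foo%20OR%20(bar~5)
```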

# Fields

To search against a specific field, use the fields parameter in the request. This request will search addresses and names for apple.

GET /v1/search/record?q=apple&fields=address&fields=name HTTP/1.1
Accept: application/json

POST /v1/search/record HTTP/1.1
Accept: application/json

{
  "q": "apple",
  "fields": ["address", "name"]
}

The following fields are supported:

  • name
  • identifier
  • address
  • business_purpose
  • date_of_birth
  • contact

# Filters

Filters are parameters that limit the result set after the query is applied. Their representation depends on the request type. In a GET request, filters are added to the query string as repeated filters parameters, each holding a URL-encoded key=value pair. For example, a filter against sources for UK Corporate Registry, which has the id ecdfb3f2ecc8c3797e77d5795a8066ef, would look like filters=source%3Decdfb3f2ecc8c3797e77d5795a8066ef. Alternatively, the filter can be expressed in the body of the request as a Filter object, which looks like the following:

{
  "filters": {
    "source": ["ecdfb3f2ecc8c3797e77d5795a8066ef"]
  }
}

Notice the array type indicating that the query can be filtered by multiple sources. An example filtered request by multiple sources would appear as follows:

GET /v1/search/entity?filters=source=ecdfb3f2ecc8c3797e77d5795a8066ef&filters=source=4ea8bac1bed868e1510ffd21842e9551&q=example HTTP/1.1
Accept: application/json

POST /v1/search/entity HTTP/1.1
Accept: application/json
{
    "q": "example",
    "filters": {
        "source": [
            "ecdfb3f2ecc8c3797e77d5795a8066ef",
            "4ea8bac1bed868e1510ffd21842e9551"
        ]
    }
}

The following fields support filtering:

  • entity_type
  • source
  • country
  • risk

Supported arguments for risk filters include all boolean risk fields documented in the ontology.
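The two filter representations described in this section can be produced from the same dictionary. The sketch below is illustrative only: `filters_to_query` and `filters_to_body` are hypothetical helpers, and the short source ids are placeholders rather than real ids.

```python
import json
from urllib.parse import urlencode


def filters_to_query(q: str, filters: dict) -> str:
    """GET form: one `filters=key=value` pair per value, URL encoded."""
    pairs = [("q", q)]
    for key, values in filters.items():
        for value in values:
            pairs.append(("filters", f"{key}={value}"))
    return urlencode(pairs)  # encodes the inner "=" as %3D


def filters_to_body(q: str, filters: dict) -> str:
    """POST form: the same filters as a JSON request body."""
    return json.dumps({"q": q, "filters": filters})


example_filters = {"source": ["aaa111", "bbb222"]}  # placeholder source ids
print(filters_to_query("example", example_filters))
# q=example&filters=source%3Daaa111&filters=source%3Dbbb222
print(filters_to_body("example", example_filters))
# {"q": "example", "filters": {"source": ["aaa111", "bbb222"]}}
```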

# Facets

A search response can display the facet results for filtering fields depending on the request. If the facets argument is applied then they will be provided in the response. For example, using the following request:

GET /v1/search/record?q=example&facets=true HTTP/1.1
Accept: application/json

The response will look like:

{
  "offset": 0,
  "limit": 1,
  "next": true,
  "size": {
    "count": 10000,
    "qualifier": "gte"
  },
  "data": [
    {
      "id": "1",
      "label": "Foo Bar",
      "source_url": "/source/1",
      "publication_date": "2018-05-23",
      "acquisition_date": "2018-02-12",
      "references_count": 3,
      "record_url": "/record/1",
      "matches": {
        "address": ["Foo <em>example</em> Bar"]
      }
    }
  ],
  "facets": {
    "country": [
      {
        "key": "TUR",
        "label": "Turkey",
        "doc_count": 37701
      },
      {
        "key": "GBR",
        "label": "United Kingdom",
        "doc_count": 26706
      },
      {
        "key": "USA",
        "label": "United States",
        "doc_count": 10813
      }
    ],
    "entity_type": [
      {
        "key": "company",
        "doc_count": 36720
      },
      {
        "key": "property",
        "doc_count": 22484
      }
    ],
    "source": [
      {
        "key": "123465646464",
        "doc_count": 435631
      },
      {
        "key": "999596322424",
        "doc_count": 21078
      }
    ],
    "source_type": [
      {
        "key": "Company data",
        "doc_count": 441718
      },
      {
        "key": "Litigation data",
        "doc_count": 28980
      },
      {
        "key": "Other",
        "doc_count": 169
      }
    ],
    "region": [
      {
        "key": "USA & Canada",
        "doc_count": 435980
      },
      {
        "key": "Asia Pacific",
        "doc_count": 26014
      },
      {
        "key": "Latin America & Caribbean",
        "doc_count": 4099
      }
    ]
  }
}
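Each facet in the response is a list of buckets with a key and a doc_count, so ranking them is a short sort. The sketch below uses a trimmed copy of the response above; `top_facet_keys` is a hypothetical helper name.

```python
def top_facet_keys(response: dict, field: str, n: int = 3) -> list:
    """Return up to n (key, doc_count) pairs for one facet, largest first."""
    buckets = response.get("facets", {}).get(field, [])
    ranked = sorted(buckets, key=lambda bucket: bucket["doc_count"], reverse=True)
    return [(bucket["key"], bucket["doc_count"]) for bucket in ranked[:n]]


# Trimmed copy of the facets shown in the example response.
sample = {
    "facets": {
        "country": [
            {"key": "TUR", "label": "Turkey", "doc_count": 37701},
            {"key": "GBR", "label": "United Kingdom", "doc_count": 26706},
            {"key": "USA", "label": "United States", "doc_count": 10813},
        ],
        "entity_type": [
            {"key": "company", "doc_count": 36720},
            {"key": "property", "doc_count": 22484},
        ],
    }
}

print(top_facet_keys(sample, "country", n=2))
# [('TUR', 37701), ('GBR', 26706)]
```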

# GeoFacets

A search response can display the geo_facet results for filtering fields depending on the request. If the facets and the geo_facets arguments are applied then they will be provided in the response. For example, using the following request:

GET /v1/search/record?q=example&facets=true&geo_facets=true HTTP/1.1
Accept: application/json

The response will look like:

{
  "offset": 0,
  "limit": 1,
  "next": true,
  "size": {
    "count": 10000,
    "qualifier": "gte"
  },
  "data": [
    {
      "id": "1",
      "label": "Foo Bar",
      "source_url": "/source/1",
      "publication_date": "2018-05-23",
      "acquisition_date": "2018-02-12",
      "references_count": 3,
      "record_url": "/record/1",
      "matches": {
        "address": ["Foo <em>example</em> Bar"]
      }
    }
  ],
  "facets": {
    "location": [
      {
        "key": "4/4/6",
        "doc_count": 12500691
      },
      {
        "key": "4/13/6",
        "doc_count": 9194167
      }
    ],
    "country": [
      {
        "key": "TUR",
        "label": "Turkey",
        "doc_count": 37701
      },
      {
        "key": "GBR",
        "label": "United Kingdom",
        "doc_count": 26706
      },
      {
        "key": "USA",
        "label": "United States",
        "doc_count": 10813
      }
    ],
    "entity_type": [
      {
        "key": "company",
        "doc_count": 36720
      },
      {
        "key": "property",
        "doc_count": 22484
      }
    ],
    "source": [
      {
        "key": "123465646464",
        "doc_count": 435631
      },
      {
        "key": "999596322424",
        "doc_count": 21078
      }
    ],
    "source_type": [
      {
        "key": "Company data",
        "doc_count": 441718
      },
      {
        "key": "Litigation data",
        "doc_count": 28980
      },
      {
        "key": "Other",
        "doc_count": 169
      }
    ],
    "region": [
      {
        "key": "USA & Canada",
        "doc_count": 435980
      },
      {
        "key": "Asia Pacific",
        "doc_count": 26014
      },
      {
        "key": "Latin America & Caribbean",
        "doc_count": 4099
      }
    ]
  }
}
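
The request path used in the facet examples can be assembled programmatically. The sketch below is one possible helper, assuming that geo_facets should always be sent together with facets, as the example request does. The name build_search_path is hypothetical.

```python
from urllib.parse import urlencode


def build_search_path(resource: str, query: str,
                      facets: bool = False, geo_facets: bool = False) -> str:
    """Build a /v1/search/<resource> path with optional facet flags."""
    params = {"q": query}
    # geo_facets is only shown together with facets, so enable both.
    if facets or geo_facets:
        params["facets"] = "true"
    if geo_facets:
        params["geo_facets"] = "true"
    return f"/v1/search/{resource}?{urlencode(params)}"


print(build_search_path("record", "example", facets=True, geo_facets=True))
```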

The search endpoint lets users specify their query in full Elasticsearch query string syntax. This allows for fielded search with more advanced operators than are available with the default query syntax. For example, an advanced query for name = foo or identifier = bar can be expressed by setting the advanced parameter to true and the q parameter to the URI-encoded equivalent of (name.value:foo OR identifier.value:bar).

GET /v1/search/entity?q=(name.value%3Afoo%20OR%20identifier.value%3Abar)&advanced=true HTTP/1.1
Accept: application/json, text/javascript

Alternatively, to search for an identifier of a specific type, use the url-encoded equivalent of (identifier.value:123 AND identifier.type:uk_company_number). The list of available fields is as follows:

  • name.value
  • identifier.value, identifier.type. For a list of possible identifier types, see the ontology documentation for identifier types
  • address.value
  • business_purpose.value
  • date_of_birth.value
  • contact.value, contact.type. For a list of possible contact types, see the ontology documentation for contact types
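
A short Python sketch of the URL encoding step for advanced queries follows. It uses the standard library's urllib.parse.quote and keeps the parentheses unescaped so the output matches the example request above. The helper name advanced_search_path is an assumption made for illustration.

```python
from urllib.parse import quote


def advanced_search_path(query: str) -> str:
    """URL-encode a fielded query and build the advanced search path."""
    # Leave parentheses readable; encode ':' and spaces.
    encoded = quote(query, safe="()")
    return f"/v1/search/entity?q={encoded}&advanced=true"


print(advanced_search_path("(name.value:foo OR identifier.value:bar)"))
print(advanced_search_path(
    "(identifier.value:123 AND identifier.type:uk_company_number)"
))
```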

To search against Sayari entities, use the /v1/search/entity endpoint. This will give a paginated response of Embedded Entities along with the fields that the search matched against.

Request:

GET /v1/search/entity?q=example&limit=1 HTTP/1.1
Accept: application/json, text/javascript

Response:

{
  "offset": 0,
  "limit": 1,
  "next": true,
  "size": {
    "count": 10000,
    "qualifier": "gte"
  },
  "data": [
    {
      "id": "1",
      "label": "example company",
      "type": "company",
      "entity_url": "/entity/1",
      "identifiers": [
        {
          "type": "test_identifier",
          "value": "1"
        }
      ],
      "countries": ["VEN"],
      "source_count": {
        "1122": {
          "count": 63,
          "label": "UK Companies House"
        }
      },
      "relationship_count": {
        "has_employee": 63
      },
      "matches": {
        "name": ["<em>example company</em>"]
      }
    }
  ]
}
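
Because search responses carry offset, limit and next fields, a client can work out where the following page starts. The sketch below assumes the next page begins at offset + limit. The function name next_offset is hypothetical.

```python
from typing import Optional


def next_offset(page: dict) -> Optional[int]:
    """Return the offset of the following page, or None on the last page."""
    if not page.get("next"):
        return None
    # Assumption: pages are contiguous, so the next one starts here.
    return page["offset"] + page["limit"]


first_page = {"offset": 0, "limit": 1, "next": True, "data": []}
print(next_offset(first_page))
```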

To search against Sayari records and documents, use the /v1/search/record endpoint. This will give a paginated response of Embedded Records along with the fields that the search matched against.

Request:

GET /v1/search/record?q=test&limit=1 HTTP/1.1
Accept: application/json, text/javascript

Response:

{
  "offset": 0,
  "limit": 1,
  "next": true,
  "size": {
    "count": 10000,
    "qualifier": "gte"
  },
  "data": [
    {
      "id": "1",
      "label": "Foo Bar",
      "source_url": "/source/1",
      "publication_date": "2018-05-23",
      "acquisition_date": "2018-02-12",
      "references_count": 3,
      "record_url": "/record/1",
      "matches": {
        "address": ["Foo <em>example</em> Bar"]
      }
    }
  ]
}
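
The matches field wraps the matched text in <em> tags. If plain text is needed, for instance for logging, the tags can be stripped. This is a minimal sketch using Python's re module. The helper name plain_matches is an assumption.

```python
import re

# Matches both the opening and closing highlight tags.
EM_TAG = re.compile(r"</?em>")


def plain_matches(result: dict) -> dict:
    """Return the matches of one search result with <em> tags removed."""
    return {
        field: [EM_TAG.sub("", snippet) for snippet in snippets]
        for field, snippets in result.get("matches", {}).items()
    }


record = {"id": "1", "matches": {"address": ["Foo <em>example</em> Bar"]}}
print(plain_matches(record))
```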

# Resolution

The resolution endpoints allow users to search for matching entities against a provided list of attributes. The endpoint is similar to the search endpoint, except that it is tuned to return only the best match, so the client needs to do little or no post-processing to filter down results. Supported attributes include one or more of the fields below, as well as the identifier type enums listed here:

  • name
  • identifier
  • country
  • address
  • date_of_birth
  • contact
  • type

If multiple values are passed for any field, the endpoint will match entities with ANY of the values. Available type values are listed here. Available country values are any ISO 3166-1 alpha-3 country codes listed here.

# GET

When making a GET request, pass attribute values as query string parameters:

Request:

GET /v1/resolution?name=Institute of the Russian Diaspora (AKA Institute of Russian Abroad)&name=ИНСТИТУТ РУССКОГО ЗАРУБЕЖЬЯ&identifier=1057746409379&ru_inn=7727536630&country=rus HTTP/1.1
Accept: application/json

Response:

{
  "fields": {
    "name": [
      "Institute of the Russian Diaspora (AKA Institute of Russian Abroad)",
      "ИНСТИТУТ РУССКОГО ЗАРУБЕЖЬЯ"
    ],
    "identifier": ["1057746409379"],
    "ru_inn": ["7727536630"],
    "country": ["rus"]
  },
  "data": [
    {
      "score": 98.4,
      "entity_id": "3N07LU74r3J_pO9HeGivAQ",
      "label": "АВТОНОМНАЯ НЕКОММЕРЧЕСКАЯ ОРГАНИЗАЦИЯ \"ИНСТИТУТ РУССКОГО ЗАРУБЕЖЬЯ\"",
      "type": "company",
      "identifiers": [
        {
          "type": "ru_ogrn",
          "value": "1057746409379",
          "label": "Ru Ogrn"
        },
        {
          "type": "ru_inn",
          "value": "7727536630",
          "label": "Ru Inn"
        },
        {
          "type": "ru_kpp",
          "value": "772701001",
          "label": "Ru Kpp"
        }
      ],
      "addresses": [
        "117218, ГОРОД МОСКВА, УЛИЦА КРЖИЖАНОВСКОГО, 13, СТР.2",
        "117218, ГОРОД МОСКВА, УЛ. КРЖИЖАНОВСКОГО, Д.13, СТР.2",
        "117218 ГОРОД МОСКВА УЛИЦА КРЖИЖАНОВСКОГО 13 СТР.2",
        "117218, г Москва, улица Кржижановского, 13 СТР.2"
      ],
      "countries": ["RUS"],
      "sources": [
        "RUS/zachestnyibiznes"
      ],
      "matched_queries": ["identifier", "country", "name"],
      "highlight": {
        "name": [
          "Autonomous Non-profit Organization <em>Institute</em> <em>Of</em> <em>The</em> <em>Russian</em> <em>Abroad</em>"
        ],
        "identifier": ["<em>7727536630</em>", "<em>1057746409379</em>"],
        "country": ["<em>RUS</em>"]
      },
      "explanation": {
        "name": [
          {
            "matched": "<em>ИНСТИТУТ</em> <em>РУССКОГО</em> <em>ЗАРУБЕЖЬЯ</em>",
            "uploaded": "ИНСТИТУТ РУССКОГО ЗАРУБЕЖЬЯ"
          }
        ]
      }
    }
  ]
}
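
The GET request above is shown unencoded for readability. In practice, spaces and non-ASCII characters in attribute values need to be percent-encoded, and repeated attributes become repeated query parameters. The sketch below shows one way to do this with the Python standard library. The helper name resolution_path is hypothetical.

```python
from urllib.parse import parse_qs, quote, urlencode


def resolution_path(attributes: dict) -> str:
    """Build a percent-encoded /v1/resolution path from attribute lists."""
    # doseq=True repeats the parameter once per value in each list.
    query = urlencode(attributes, doseq=True, quote_via=quote)
    return f"/v1/resolution?{query}"


attributes = {
    "name": [
        "Institute of the Russian Diaspora (AKA Institute of Russian Abroad)",
        "ИНСТИТУТ РУССКОГО ЗАРУБЕЖЬЯ",
    ],
    "identifier": ["1057746409379"],
    "ru_inn": ["7727536630"],
    "country": ["rus"],
}

path = resolution_path(attributes)
print(path)
# Decoding the query string gives back the original attribute lists.
print(parse_qs(path.split("?", 1)[1]) == attributes)
```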

# POST

The POST endpoint expects a JSON object of attribute and value array pairs.

Request:

POST /v1/resolution HTTP/1.1
Content-Type: application/json
Accept: application/json

{
  "name": [
    "Institute of the Russian Diaspora (AKA Institute of Russian Abroad)",
    "ИНСТИТУТ РУССКОГО ЗАРУБЕЖЬЯ"
  ],
  "identifier": ["1057746409379"],
  "ru_inn": ["7727536630"],
  "country": ["rus"]
}

Response:

{
  "fields": {
    "name": [
      "Institute of the Russian Diaspora (AKA Institute of Russian Abroad)",
      "ИНСТИТУТ РУССКОГО ЗАРУБЕЖЬЯ"
    ],
    "identifier": ["1057746409379"],
    "ru_inn": ["7727536630"],
    "country": ["rus"]
  },
  "data": [
    {
      "entity_id": "3N07LU74r3J_pO9HeGivAQ",
      "label": "АВТОНОМНАЯ НЕКОММЕРЧЕСКАЯ ОРГАНИЗАЦИЯ \"ИНСТИТУТ РУССКОГО ЗАРУБЕЖЬЯ\"",
      "type": "company",
      "identifiers": [
        {
          "type": "ru_ogrn",
          "value": "1057746409379",
          "label": "Ru Ogrn"
        },
        {
          "type": "ru_inn",
          "value": "7727536630",
          "label": "Ru Inn"
        },
        {
          "type": "ru_kpp",
          "value": "772701001",
          "label": "Ru Kpp"
        }
      ],
      "addresses": [
        "117218, ГОРОД МОСКВА, УЛИЦА КРЖИЖАНОВСКОГО, 13, СТР.2",
        "117218, ГОРОД МОСКВА, УЛ. КРЖИЖАНОВСКОГО, Д.13, СТР.2",
        "117218 ГОРОД МОСКВА УЛИЦА КРЖИЖАНОВСКОГО 13 СТР.2",
        "117218, г Москва, улица Кржижановского, 13 СТР.2"
      ],
      "countries": ["RUS"],
      "matched_queries": ["identifier", "country", "name"],
      "highlight": {
        "name": [
          "Autonomous Non-profit Organization <em>Institute</em> <em>Of</em> <em>The</em> <em>Russian</em> <em>Abroad</em>"
        ],
        "identifier": ["<em>7727536630</em>", "<em>1057746409379</em>"],
        "country": ["<em>RUS</em>"]
      }
    }
  ]
}
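
A sketch of the POST request in Python follows, using only the standard library. It builds the request object without sending it. The full URL is an assumption that combines the api.sayari.com host from the Authentication section with the /v1/resolution path, and the function name and token value are placeholders.

```python
import json
from urllib.request import Request


def build_resolution_request(attributes: dict, token: str) -> Request:
    """Build (but do not send) a POST request for the resolution endpoint."""
    # Keep Cyrillic text as UTF-8 rather than \u escapes.
    body = json.dumps(attributes, ensure_ascii=False).encode("utf-8")
    return Request(
        "https://api.sayari.com/v1/resolution",  # assumed full URL
        data=body,
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


attributes = {
    "name": ["ИНСТИТУТ РУССКОГО ЗАРУБЕЖЬЯ"],
    "identifier": ["1057746409379"],
    "country": ["rus"],
}
request = build_resolution_request(attributes, "YOUR_ACCESS_TOKEN")
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it. That step is left out here so the example runs without network access or credentials.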

# Sources

The sources endpoint returns metadata for all sources that Sayari collects data from.

GET /v1/source/:id
Accept: application/json

GET /v1/sources
Accept: application/json

# Arguments

limit: integer optional
A limit on the number of objects to be returned with a range between 1 and 100. Defaults to 100.

offset: integer optional
Number of results to skip before returning response. Defaults to 0.
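
To illustrate how limit and offset combine when listing all sources, the following sketch yields the parameter pairs needed to cover a known total. It clamps limit to the documented range of 1 to 100. The generator name source_pages is hypothetical, and the total is assumed to come from the size.count field of a previous response.

```python
def source_pages(total: int, limit: int = 100):
    """Yield offset/limit pairs that cover `total` sources."""
    # The documented range for limit is 1 to 100.
    limit = max(1, min(limit, 100))
    for offset in range(0, total, limit):
        yield {"offset": offset, "limit": limit}


for params in source_pages(124):
    print(params)
```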

# Content Types

Supported content types include:

application/json
JSON encoded response.

# Source by ID

GET /v1/source/b9dc2ca839c318d04910a8a680131fdf HTTP/1.1
Accept: application/json, text/javascript

{
  "id": "b9dc2ca839c318d04910a8a680131fdf",
  "label": "Albania Trade Register Extracts",
  "description": "Contains records for companies and sole proprietorships and external hyperlinks to additional official documentation. Provides key personnel, including managers and shareholders, and standard company information; documents from external hyperlinks contain individual ID numbers.",
  "country": "ALB",
  "region": "europe_&_central_asia",
  "date_added": "2019-11-22",
  "source_type": "company_data",
  "record_type": "company_record",
  "structure": "unstructured",
  "source_url": "http://www.qkr.gov.al/kerko/kerko-ne-regjistrin-tregtar/kerko-per-subjekt/",
  "pep": false,
  "watchlist": false
}

# List all Sources

GET /v1/sources?limit=2 HTTP/1.1
Accept: application/json, text/javascript

{
  "offset": 0,
  "limit": 2,
  "size": {
    "count": 124,
    "qualifier": "eq"
  },
  "next": true,
  "data": [
    {
      "id": "b9dc2ca839c318d04910a8a680131fdf",
      "label": "Albania Trade Register Extracts",
      "description": "Contains records for companies and sole proprietorships and external hyperlinks to additional official documentation. Provides key personnel, including managers and shareholders, and standard company information; documents from external hyperlinks contain individual ID numbers.",
      "country": "ALB",
      "region": "europe_&_central_asia",
      "date_added": "2019-11-22",
      "source_type": "company_data",
      "record_type": "company_record",
      "structure": "unstructured",
      "source_url": "http://www.qkr.gov.al/kerko/kerko-ne-regjistrin-tregtar/kerko-per-subjekt/",
      "pep": false,
      "watchlist": false
    },
    {
      "id": "4ea8bac1bed868e1510ffd21842e9551",
      "label": "Albania Trade Register Bulletins",
      "description": "Contains historical company registry filing notices indexed by date, as well as external hyperlinks to additional official documentation. Provides standard company information; documents from external hyperlinks contain management and ownership information.",
      "country": "ALB",
      "region": "europe_&_central_asia",
      "date_added": "2019-11-12",
      "source_type": "company_data",
      "record_type": "company_record",
      "structure": "unstructured",
      "source_url": "http://www.qkr.gov.al/newsroom/buletini/",
      "pep": false,
      "watchlist": false
    }
  ]
}


# Info Endpoints

# Usage

The usage endpoint provides a simple interface for retrieving usage information for your API account. This includes both views per API path and credits consumed. The response also specifies the time period covered by the usage query and whether or not this includes total usage.

GET /usage
Accept: application/json

# Arguments

from: date optional
An ISO 8601 encoded date string, in the format YYYY-MM-DD, indicating the start of the time period for which to obtain usage stats.
to: date optional
An ISO 8601 encoded date string, in the format YYYY-MM-DD, indicating the end of the time period for which to obtain usage stats.
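
As a small illustration of the date format, Python's date.isoformat() produces YYYY-MM-DD strings directly. The sketch below builds a usage path from optional start and end dates. The function name usage_path is an assumption.

```python
from datetime import date
from typing import Optional
from urllib.parse import urlencode


def usage_path(start: Optional[date] = None, end: Optional[date] = None) -> str:
    """Build the /usage path with optional from/to dates (YYYY-MM-DD)."""
    params = {}
    if start is not None:
        params["from"] = start.isoformat()
    if end is not None:
        params["to"] = end.isoformat()
    return "/usage" + (f"?{urlencode(params)}" if params else "")


print(usage_path(date(2020, 5, 21), date(2021, 5, 28)))
```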

# Content Types

Supported content types include:

application/json
JSON encoded response.

# Examples

GET /usage
{
  "usage": {
    "get_entity": 1,
    "search": 1,
    "search_entities": 1,
    "search_records": 1
  },
  "from": "2020-05-22T22:22:22.222Z",
  "to": "2020-06-02T12:22:22.222Z"
}

Note: If usage stats were not aggregated on the specified date, the next closest date will be provided.

GET /usage?from=2020-05-21&to=2021-05-28
{
  "usage": {
    "get_entity": 5,
    "search": 2,
    "search_entities": 4,
    "search_records": 5
  },
  "from": "2020-05-21T00:00:00Z",
  "to": "2021-05-28T00:00:00Z"
}

# History

The history endpoint returns a user's event history.

GET /history
Accept: application/json

# Arguments

events: string[] optional
The type of events to filter on.
from: date optional
An ISO 8601 encoded date string, in the format YYYY-MM-DD, indicating the start of the time period for the events.
to: date optional
An ISO 8601 encoded date string, in the format YYYY-MM-DD, indicating the end of the time period for the events.
size: integer optional
The maximum number of events to return.
token: string optional
Pagination token to retrieve the next page of results.

# Content Types

Supported content types include:

application/json
JSON encoded response.

# Examples

GET /history?from=2020-02-22&to=2021-05-29
{
    "events": [
         {
            "user": "auth0|abcd",
            "environment": "production",
            "event": "add_to_graph",
            "data": {
                "countries": [
                    "UKR"
                ],
                "email": "[email protected]",
                "groups": [
                    "Group-A"
                ],
                "ip": 0,
                "level": "info",
                "message": "add_to_graph",
                "size": 51,
                "sources": [
                    "defddbef5048d6"
                ]
            },
            "timestamp": "2021-05-20T00:13:14.307Z"
        },
        ...
    ]
}
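
Once a history response has been parsed, the events can be summarized by type. The sketch below counts events by their event field, following the response shape shown above. The function name count_events and the "search" event type in the sample are illustrative assumptions.

```python
from collections import Counter


def count_events(history: dict) -> Counter:
    """Count history events by their `event` type."""
    return Counter(entry["event"] for entry in history.get("events", []))


# Hypothetical payload reduced to the fields used here.
history = {
    "events": [
        {"event": "add_to_graph", "timestamp": "2021-05-20T00:13:14.307Z"},
        {"event": "add_to_graph", "timestamp": "2021-05-21T08:00:00.000Z"},
        {"event": "search", "timestamp": "2021-05-22T09:30:00.000Z"},
    ]
}

print(count_events(history))
```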