# Overview

Sayari's bulk data provides access to billions of entities (i.e., vertices, nodes) and relationships (i.e., edges) as displayed in Sayari's suite of products. This documentation outlines the data structure, supported data formats, and data delivery mechanisms of Sayari's bulk data offering.

# Data Structure

Our data is delivered in two sets of files: entities and relationships. Within each file, a row describes one entity or relationship.

Note

Any entity or relationship may have multiple attributes of the same type; for example, an entity may have multiple addresses (physical, mailing, etc.). Accordingly, all attributes are included as arrays.
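As a rough illustration of what array-valued attributes mean for a consumer, the sketch below assumes a row has already been parsed into a Python dict. The key names `attributes`, `address`, `phone`, and `value` are hypothetical placeholders chosen for the example, not the documented schema.

```python
# Hypothetical parsed row: the key names "attributes", "address", and "value"
# are illustrative assumptions, not the documented schema.
row = {
    "entity_id": "example-id",
    "attributes": {
        "address": [
            {"value": "1 Example Street"},  # e.g., a physical address
            {"value": "PO Box 100"},        # e.g., a mailing address
        ],
    },
}


def attribute_values(row, name):
    """Return every value of one attribute type, or an empty list if absent."""
    items = row.get("attributes", {}).get(name, [])
    return [item["value"] for item in items]


addresses = attribute_values(row, "address")
phones = attribute_values(row, "phone")
```

Because every attribute is an array, the same accessor works whether an entity has zero, one, or many values of a given type.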

# Entities

Like an entity profile in the Sayari suite of products, a row in an entity file describes a single entity, including its attributes, risk factors, and summary information.

Summary information includes properties that describe the entity, as outlined below:

| Field | Type | Description |
| --- | --- | --- |
| entity_id | string | Primary key |
| type | string | Entity type, see Entities |
| label | string | Most commonly reported name |
| label_en | string | Most commonly reported American Standard Code for Information Interchange (ASCII) name |
| closed | boolean | Whether an entity is closed |
| degree | long | Number of unique neighboring entities |
| edge_counts | map | Number of neighbors per edge type |
| sanctioned | boolean | See Sanctioned |
| pep | boolean | See Politically Exposed Person (PEP) |
| source | array[string] | List of data sources an entity was referenced in |
| num_documents | long | Number of source documents an entity was referenced in |
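The summary fields lend themselves to simple row-level filtering. The sketch below assumes, purely for illustration, that entity rows arrive as one JSON object per line; the actual file format may differ, and the sample rows and `type` values are made up.

```python
import json

# Two made-up rows serialized one JSON object per line. The line-delimited
# JSON layout and the "type" values are assumptions for this example.
sample = "\n".join([
    json.dumps({"entity_id": "a1", "type": "company", "label": "Example Co",
                "closed": False, "degree": 3, "sanctioned": True, "pep": False,
                "source": ["source_x"], "num_documents": 2}),
    json.dumps({"entity_id": "b2", "type": "person", "label": "Example Person",
                "closed": False, "degree": 1, "sanctioned": False, "pep": False,
                "source": ["source_y"], "num_documents": 1}),
])


def flagged_entities(lines):
    """Yield (entity_id, label) for rows where sanctioned or pep is true."""
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines
        row = json.loads(line)
        if row.get("sanctioned") or row.get("pep"):
            yield row["entity_id"], row["label"]


flagged = list(flagged_entities(sample.splitlines()))
```

Processing one line at a time keeps memory use flat, which matters for files that hold many rows.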

# Relationships

A row in a relationship file describes a single relationship, including its attributes and summary information. Each relationship is a directed edge between two entities (i.e., vertices), running from src (the tail vertex) to dst (the head vertex); both are identified by their entity_id values.

| Field | Type | Description |
| --- | --- | --- |
| src | string | entity_id of the tail vertex |
| dst | string | entity_id of the head vertex |
| type | string | Relationship type, see Relationships |
| from_date | string | Start date of a relationship |
| date | string | As-of date of a relationship |
| to_date | string | End date of a relationship |
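To show how src and dst tie relationship rows back to entities, the sketch below builds a set of neighbors per entity_id from a few made-up rows. It counts neighbors in both directions and ignores relationship type; whether the `degree` summary field is computed the same way is an assumption, and the `type` values are placeholders.

```python
from collections import defaultdict

# Made-up relationship rows; the "type" values are placeholders.
relationships = [
    {"src": "a", "dst": "b", "type": "example_type_1"},
    {"src": "a", "dst": "b", "type": "example_type_2"},
    {"src": "c", "dst": "a", "type": "example_type_1"},
]


def neighbor_sets(rows):
    """Map each entity_id to the set of entity_ids it shares an edge with."""
    neighbors = defaultdict(set)
    for row in rows:
        neighbors[row["src"]].add(row["dst"])
        neighbors[row["dst"]].add(row["src"])
    return neighbors


neighbors = neighbor_sets(relationships)
# Two edges between "a" and "b" still count as one unique neighbor.
degree = {entity_id: len(others) for entity_id, others in neighbors.items()}
```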

# Data Formats

# Data Delivery

# Signed URLs

Sayari provides bulk data access via signed URLs. Each signed URL grants time-limited access to its corresponding data file. Signed URLs are delivered as a text file of newline-delimited URLs.

Note

Signed URLs are valid for a maximum of 7 days.

# Example usage

wget -i signed_urls.txt -c -t 3
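For environments without wget, a rough Python equivalent using only the standard library might look like the sketch below. It retries each URL up to three times, like `-t 3`, but does not resume partial downloads the way `-c` does. The helper names are hypothetical, and using the last path segment of the URL as the local file name is an assumption about how the signed URLs are shaped.

```python
import os
import shutil
import urllib.request
from urllib.parse import urlsplit


def read_urls(text):
    """Return the non-empty lines of a newline-delimited URL list."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def local_name(url):
    """Take the last path segment as the file name, ignoring the query string."""
    return os.path.basename(urlsplit(url).path)


def download(url, tries=3):
    """Fetch one URL into the current directory, retrying on failure."""
    for attempt in range(1, tries + 1):
        try:
            with urllib.request.urlopen(url) as response, \
                    open(local_name(url), "wb") as out:
                shutil.copyfileobj(response, out)  # stream to disk in chunks
            return
        except OSError:  # urllib.error.URLError is a subclass of OSError
            if attempt == tries:
                raise


def download_all(path="signed_urls.txt"):
    """Download every URL listed in the signed URL file."""
    with open(path) as handle:
        for url in read_urls(handle.read()):
            download(url)
```

Calling `download_all()` from the directory that holds signed_urls.txt fetches each file in turn. Because the URLs expire, downloads should start well within the validity window.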

# Suggested tools

# SFTP Download

Sayari also provides bulk data access via SFTP. To use SFTP, generate a Secure Shell (SSH) key pair and share the public key with Sayari. For guidance on generating a key pair, see the tutorial "Generate a Secure Shell (SSH) key pair for an SFTP dropbox". After receiving the public key, Sayari will provide the corresponding username.

# Example usage

sftp <username>@<host>
sftp> get path/to/file

# Suggested tools