Alternative data (in finance) refers to data used to obtain insight into the investment process. These data sets are often used by hedge fund managers and other institutional investment professionals within an investment company.
Alternative data sets are information about a particular company that is published by sources outside of the company, which can provide unique and timely insights into investment opportunities.
Alternative data sets are often categorized as big data, which means that they may be very large and complex and often cannot be handled by software traditionally used for storing or handling data, such as Microsoft Excel. An alternative data set can be compiled from various sources such as financial transactions, sensors, mobile devices, satellites, public records, and the internet.
Alternative data can be compared with data that is traditionally used by investment companies, such as investor presentations, SEC filings, and press releases. These examples of "traditional data" are produced directly by the company itself.
Since alternative data sets originate as a product of a company's operations, these data sets are often less readily accessible and less structured than traditional sources of data.
Alternative data is also known as "data exhaust". The company that produces alternative data generally overlooks the value of the data to institutional investors. During the last decade, many data brokers, aggregators, and other intermediaries began specializing in providing alternative data to investors and analysts.
Types
Examples of alternative data include:
* Geolocation (foot traffic)
* Credit card transactions
* Email receipts
* Point-of-sale transactions
* Web site usage
* Mobile App or App Store analytics
* Crowdsourcing
* Obscure city hall records
* Satellite images
* Social media posts
* Online browsing activity
* Shipping container receipts
* Product reviews
* Price trackers
* Shipping trackers
* Internet activity and quality data
Uses
Alternative data is being used by fundamental and quantitative institutional investors to create innovative sources of alpha. The field is still in the early phases of development, yet depending on the resources and risk tolerance of a fund, multiple approaches abound to participate in this new paradigm.
The process to extract benefits from alternative data can be extremely challenging. The analytics, systems, and technologies for processing such data are relatively new, and most institutional investors do not have capabilities to integrate alternative data into their investment decision process. However, with the right tools and strategy, a fund can mitigate costs while creating an enduring competitive advantage.
Most alternative data research projects are lengthy and resource intensive; therefore, due diligence is required before working with a data set. The due diligence should include approval from the compliance team, validation of the processes that create and deliver the data set, and identification of investment insights that can be additive to the investment process.
However, the use of alternative data is not restricted to investing; it is also used successfully in economics and politics as well as in retail and e-commerce. It is possible to predict geopolitical risk through in-depth analysis of alternative data, while social media sites reveal a host of data for consumer sentiment analysis.
Methodology
Alternative data can be accessed via:
* Web scraping (or web harvesting, performed by computer programmers who design an algorithm that searches websites for specific data on a desired topic)
* Acquisition of raw data
* Third-party licensing
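As a rough illustration of the web scraping route, the following minimal sketch extracts job titles from a page using only the Python standard library. The sample HTML, the `job` class name, and the names `JobTitleParser` and `extract_job_titles` are assumptions made for this example; a real scraper would first download the page and would need to match the actual markup of the target site.

```python
from html.parser import HTMLParser

# Hypothetical sample page; a real scraper would download the HTML first.
SAMPLE_HTML = """
<html><body>
  <ul>
    <li class="job">Data Engineer</li>
    <li class="job">Quantitative Analyst</li>
    <li class="news">Quarterly update</li>
  </ul>
</body></html>
"""


class JobTitleParser(HTMLParser):
    """Collects the text of every <li class="job"> element."""

    def __init__(self):
        super().__init__()
        self._in_job = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "li" and dict(attrs).get("class") == "job":
            self._in_job = True

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_job = False

    def handle_data(self, data):
        text = data.strip()
        if self._in_job and text:
            self.titles.append(text)


def extract_job_titles(html):
    """Return the job titles found in the given HTML string."""
    parser = JobTitleParser()
    parser.feed(html)
    return parser.titles


print(extract_job_titles(SAMPLE_HTML))
```

Running such a parser on a schedule and storing the results with a timestamp would turn individual page snapshots into a time series.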
Analysis
In finance, alternative data is often analyzed in the following ways:
* Scarcity: the data information overload within financial markets
* Granularity: the level of detail and aggregation of data (including time)
* History: the trajectory of data
* Structure: the form of the data (CSV, JSON, etc.)
* Coverage: the stocks or geographical locations that data can be linked with
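Some of the dimensions above can be measured directly on a candidate data set. The sketch below is one possible way to summarize history, granularity, and coverage for a small set of records; the record layout `(date, ticker, value)`, the sample values, and the function name `profile` are assumptions made for illustration only.

```python
from datetime import date

# Hypothetical records: (observation date, ticker, value)
RECORDS = [
    (date(2023, 1, 2), "AAA", 10.0),
    (date(2023, 1, 9), "AAA", 12.0),
    (date(2023, 1, 16), "AAA", 11.0),
    (date(2023, 1, 2), "BBB", 7.0),
    (date(2023, 1, 16), "BBB", 8.0),
]


def profile(records):
    """Summarize history, granularity and coverage of a data set."""
    if not records:
        raise ValueError("records must not be empty")
    dates = sorted({d for d, _, _ in records})
    tickers = sorted({t for _, t, _ in records})
    # Gaps in days between consecutive distinct observation dates
    gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
    return {
        "history_days": (dates[-1] - dates[0]).days,   # length of the history
        "finest_gap_days": min(gaps) if gaps else None,  # time granularity
        "coverage": tickers,                            # linked stocks
    }


print(profile(RECORDS))
```

A short history or a narrow coverage list found this way can be a reason to reject a data set before committing research resources to it.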
Best practices
While compliance and internal regulation are widely practiced in the alternative data field, there exists a need for an industry-wide best practices standard. Such a standard should address personally identifiable information (PII) obfuscation and access scheme requirements, among other issues. Compliance professionals and decision makers can benefit from proactively creating internal guidelines for data operations. Publications such as NIST 800-122 provide guidelines for protecting PII and are useful when developing internal best practices.
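One common way to obfuscate PII before a data set reaches analysts is to replace identifying fields with keyed hashes. The sketch below illustrates that idea with HMAC-SHA256 from the Python standard library; the field names in `PII_FIELDS`, the sample record, and the function name `pseudonymize` are assumptions for this example, and the technique shown is not taken from NIST 800-122 or any other standard mentioned here. Keyed hashing is pseudonymization rather than full anonymization, so it may not be sufficient on its own.

```python
import hashlib
import hmac

PII_FIELDS = {"email", "name"}  # hypothetical field names treated as PII


def pseudonymize(record, secret_key):
    """Return a copy of record with PII fields replaced by keyed hashes."""
    out = {}
    for field, value in record.items():
        if field in PII_FIELDS:
            digest = hmac.new(secret_key, str(value).encode("utf-8"), hashlib.sha256)
            out[field] = digest.hexdigest()
        else:
            out[field] = value
    return out


# Hypothetical transaction record and key
record = {"email": "jane@example.com", "name": "Jane Doe", "amount": 42.5}
print(pseudonymize(record, b"demo-key-not-for-production"))
```

Because the same input and key always produce the same digest, records belonging to one person can still be linked for analysis without exposing the underlying identifier.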
The Investment Data Standards Organization (IDSO) was established to develop, maintain, and promote industry-wide standards and best practices for the alternative data industry.
Web scraping
Legal aspects surrounding web scraping of alternative data have yet to be defined. Current best practices address the following issues when determining legal compliance of web crawling operations:
* Review of the terms and conditions associated with the websites crawled
* Control over the potential interference with crawled websites
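Part of these checks can be automated. The following sketch, using Python's standard `urllib.robotparser`, skips paths that a site's `robots.txt` disallows and pauses between requests to limit load on the crawled site. The `robots.txt` content, the crawler name in `USER_AGENT`, and the helper names `build_policy` and `crawl` are assumptions for illustration. A `robots.txt` file is not the same as a site's terms and conditions, which still need to be reviewed separately.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download it from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

USER_AGENT = "example-research-bot"  # hypothetical crawler name


def build_policy(robots_txt):
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def crawl(urls, policy, fetch, sleep=time.sleep, default_delay=1.0):
    """Fetch only permitted URLs, pausing between requests."""
    delay = policy.crawl_delay(USER_AGENT)
    delay = float(delay) if delay is not None else default_delay
    results = {}
    for url in urls:
        if not policy.can_fetch(USER_AGENT, url):
            continue  # skip paths the site asks crawlers to avoid
        results[url] = fetch(url)
        sleep(delay)  # throttle to limit interference with the site
    return results
```

The `fetch` and `sleep` arguments are passed in so the logic can be exercised without network access, for example `crawl(urls, build_policy(ROBOTS_TXT), fetch=lambda u: "ok", sleep=lambda s: None)`.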
Web scraped data refers to data harvested from public websites. With 4 billion webpages and 1.2 million terabytes of data on the internet, there is a mountain of information that can be valuable to investors when analyzing corporate performance.
The companies that specialize in this type of data collection, like Thinknum Alternative Data, write programs that access targeted websites and collect and store the scraped information on a periodic basis. In some cases, web scraping relies on public APIs to access the data behind those pages directly, without visiting the actual website.
Types of web scraped data include:
* Job listings: A company that is increasing hiring and headcount is likely experiencing growth.
* Company ratings: Sites like Glassdoor allow employees to rate their company; increasing ratings, especially in conjunction with increasing job listings, can be another growth indicator.
* Online retail data: High product rankings on online retailers suggest strong sales for those product manufacturers. On the flip side, heavy discounting of products suggests weak sales.
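Indicators like these are usually derived from repeated snapshots rather than a single scrape. As a minimal sketch, the function below turns a series of job-listing counts into period-over-period growth rates; the monthly counts and the function name `listing_growth` are hypothetical, and a rising figure is only a possible growth signal, not proof of one.

```python
def listing_growth(counts):
    """Period-over-period fractional change in job-listing counts."""
    changes = []
    for prev, curr in zip(counts, counts[1:]):
        # Growth is undefined when the previous period had no listings
        changes.append(None if prev == 0 else (curr - prev) / prev)
    return changes


MONTHLY_LISTINGS = [100, 110, 121, 115]  # hypothetical scraped counts
print(listing_growth(MONTHLY_LISTINGS))
```

The same pattern applies to other scraped series, such as average company ratings or the share of products on discount.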
The Standards Board for Alternative Investments (SBAI) is the global standard-setting agency for the alternative investment industry and guardian of the Alternative Investment Standards. The agency is supported by approximately 200 alternative investment managers and institutional investors, who collectively manage $3.5 trillion. The SBAI has published the Standardised Trial Data License Agreement, which addresses issues investment managers face when trialling new data, such as alternative data and big data. Thomas Deinet, Executive Director of the SBAI, said: "This Trial Data Licence Agreement template highlights a number of very important issues, including personal data protection, which has become a hot topic in light of the overhaul of data protection regulation in many jurisdictions. It also includes key protections for managers in areas such as prevention of insider trading and 'right to use data'. It is crucial that managers and data vendors fully understand all risks when selling and using new data."
(Source: "... publishes Standardised Trial Data License Agreement". 6 February 2019. Retrieved 15 May 2019.)
See also
* Fintech
Further reading
* Alexander Denev and Saeed Amen, ''The Book of Alternative Data: A Guide for Investors, Traders and Risk Managers'' (Wiley 2020)
* Marko Kolanovic and Rajesh T. Krishnamachari, ''Big Data & AI Strategies: Machine Learning and Alternative Data Approach to Investing'' (JP Morgan 2018)