CodeQL

Semmle Inc. developed a code-analysis platform and was acquired by GitHub (itself owned by Microsoft) on 18 September 2019 for an undisclosed amount. Semmle's LGTM technology automates code review, tracks developer contributions, and flags software security issues. The LGTM platform uses the CodeQL query engine (formerly QL) to perform semantic analysis on software code bases. GitHub aims to integrate Semmle technology to provide continuous vulnerability detection services. In November 2019, use of CodeQL was made free for research and open source. CodeQL either descends directly from .QL (pronounced "dot-que-ell"), which belongs to the Datalog family, or is an evolution of similar technology. SemmleCode is an object-oriented query language for deductive databases developed by Semmle. It is distinguished within this class by its support for recursive queries.


Corporate background

The company was headquartered in San Francisco, with its development operations based in Blue Boar Court, Alfred Street, central Oxford, England. Semmle's customers included Credit Suisse, NASA, and Dell.


SemmleCode background


Academic

SemmleCode builds on academic research on querying the source of software programs. The first such system was Linton's Omega system, where queries were phrased in QUEL. QUEL did not allow for recursion in queries, making it difficult to inspect hierarchical program structures such as the call graph. The next significant development was therefore the use of logic programming, which does allow such recursive queries, in the XL C++ Browser. The disadvantage of using a full logic programming language, however, is that it is very difficult to attain acceptable efficiency. The CodeQuest system, developed at the University of Oxford, was the first to exploit the observation that Datalog, a very restrictive version of logic programming, is in the sweet spot between expressive power and efficiency. The QL query language is an object-oriented version of Datalog.
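
The kind of recursive call-graph query discussed above can be illustrated with a minimal Python sketch. It is not QL and not how CodeQuest or SemmleCode is implemented; it only mimics Datalog-style bottom-up evaluation of a hypothetical relation reaches(f, g), derived from a hypothetical calls(f, g) relation over invented function names.

```python
from typing import Set, Tuple

Edge = Tuple[str, str]


def transitive_calls(calls: Set[Edge]) -> Set[Edge]:
    """Bottom-up evaluation of the Datalog-style rules

        reaches(f, g) :- calls(f, g).
        reaches(f, h) :- reaches(f, g), calls(g, h).

    Facts are derived repeatedly until no new fact appears (a fixpoint).
    Only the facts found in the previous round are joined with `calls`,
    so each round does work proportional to what is new.
    """
    reaches = set(calls)   # facts from the first (non-recursive) rule
    delta = set(calls)     # facts discovered in the most recent round
    while delta:
        derived = {
            (f, h)
            for (f, g) in delta
            for (g2, h) in calls
            if g == g2
        }
        delta = derived - reaches   # keep only genuinely new facts
        reaches |= delta
    return reaches


# Hypothetical call graph: each pair (f, g) means "f calls g directly".
CALLS = {
    ("main", "parse"),
    ("parse", "lex"),
    ("lex", "read_char"),
    ("main", "report"),
}
REACHES = transitive_calls(CALLS)
```

Because the set of possible facts is finite and facts are only ever added, the loop terminates even when the call graph contains cycles, which is one reason Datalog-style evaluation is easier to make efficient than a full logic programming language.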


Industrial

The early research on querying the source of software programs spun off a number of industrial applications. In particular, it became the cornerstone of systems for application intelligence (data mining on the source of software systems) and software renovation. As of 2007, Paris-based CAST was one of the market leaders in that area, and other significant players included BluePhoenix in Herzliya, Israel. SemmleCode differs from these systems in its use of an object-oriented query language, which allows programmers to easily formulate new queries that are particular to their own project. A full account of the academic and industrial developments leading up to the creation of SemmleCode can be found in a paper by Hajiyev et al. (Elnar Hajiyev, Mathieu Verbaere, and Oege de Moor, CodeQuest: Scalable Source Code Queries with Datalog. In ''ECOOP 2006: Proceedings of the 2006 European Conference on Object-Oriented Programming'', pages 2–27. Springer, 2006).


Sample query in QL

To illustrate the use of QL, consider the well-known rule in object-oriented programming that public fields should be declared final. To find violations of that rule, we should search for fields that are public but not final. In QL, that requirement is expressed as follows:

    from Field f
    where f.hasModifier("public") and not(f.hasModifier("final"))
    select f.getDeclaringType().getPackage(), f.getDeclaringType(), f

Here not only is the offending field f selected, but also the package and type in which its declaration occurs.
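
For readers unfamiliar with the from/where/select form, the following Python sketch models the same check over a small in-memory list of field records. It is only an analogy: the Field record, its attribute names, and the sample data are hypothetical and do not correspond to the QL library or to any real code base.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a field extracted from a Java program."""
    package: str
    declaring_type: str
    name: str
    modifiers: FrozenSet[str]

    def has_modifier(self, modifier: str) -> bool:
        return modifier in self.modifiers


def public_non_final(fields: List[Field]) -> List[Tuple[str, str, str]]:
    """Mirror of the query: 'from' ranges over all fields, 'where' filters
    them, and 'select' projects (package, declaring type, field)."""
    return [
        (f.package, f.declaring_type, f.name)
        for f in fields
        if f.has_modifier("public") and not f.has_modifier("final")
    ]


# Invented sample data for illustration only.
FIELDS = [
    Field("com.example", "Account", "balance", frozenset({"public"})),
    Field("com.example", "Account", "MAX", frozenset({"public", "static", "final"})),
    Field("com.example.util", "Cache", "entries", frozenset({"private"})),
]
VIOLATIONS = public_non_final(FIELDS)
```

The comprehension plays the role of the declarative query: it states which fields qualify rather than how to traverse the program, and only the first sample field satisfies both conditions.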


SemmleCode integration with development environments

SemmleCode provides a user interface via the Eclipse IDE to query Java code (both source code and bytecode) as well as XML files, and to edit QL queries. This is, however, only one application of the underlying technology: QL can be used to query any other type of complex data. Following Semmle's move into the Microsoft/GitHub corporate family, the original Eclipse-based workflow has been supplanted by a workflow based around Microsoft's Visual Studio Code.


See also

* List of tools for static code analysis
* .QL
* Datalog


Further reading

* Mark A. Linton. Implementing relational views of programs. In Peter B. Henderson, editor, ''Software Development Environments (SDE)'', pages 132–140, 1984.

