Apache PDFBox is an open source pure-Java library that can be used to create, render, print, split, merge, alter, verify and extract text and metadata of PDF files. Open Hub reports over 11,000 commits (since the start as an Apache project) by 18 contributors representing more than 140,000 lines of code. PDFBox has a well established, mature codebase maintained by an average size development team with increasing year-over-year commits. Using the COCOMO model, it took an estimated 46 person-years of effort.
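
To make the effort figure concrete, here is a minimal sketch of the Basic COCOMO formula, effort = a × KLOC^b person-months. The coefficients a = 2.4 and b = 1.05 are Boehm's published "organic" values; that Open Hub applies exactly these values is an assumption, and the class and method names are hypothetical. Because the text only gives "more than 140,000" lines of code, an input of 140 KLOC yields a lower bound in the mid-30s of person-years rather than reproducing the 46 person-year figure.

```java
// Basic COCOMO sketch. Assumption: "organic" project coefficients
// a = 2.4 and b = 1.05; Open Hub's exact parameters are not stated in the text.
class CocomoSketch {
    static final double A = 2.4;   // effort multiplier (assumed organic mode)
    static final double B = 1.05;  // size exponent (assumed organic mode)

    // Estimated effort in person-months for a code size in thousands of lines.
    static double personMonths(double kloc) {
        if (kloc <= 0) {
            throw new IllegalArgumentException("kloc must be positive");
        }
        return A * Math.pow(kloc, B);
    }

    // Same estimate converted to person-years (12 person-months per year).
    static double personYears(double kloc) {
        return personMonths(kloc) / 12.0;
    }

    public static void main(String[] args) {
        double kloc = 140.0; // lower bound taken from "more than 140,000 lines"
        System.out.printf("%.0f KLOC -> %.1f person-years%n", kloc, personYears(kloc));
    }
}
```

Since the exponent is greater than 1, the estimate grows slightly faster than the line count, so a larger actual line count would move the result toward the reported figure.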


Structure

Apache PDFBox has these components:
* PDFBox: the main part
* FontBox: handles font information
* XmpBox: handles XMP metadata
* Preflight (optional): checks PDF files for PDF/A-1b conformity


History

PDFBox was started in 2002 on SourceForge by Ben Litchfield, who wanted to be able to extract text from PDF files for Lucene. It became an Apache Incubator project in 2008, and an Apache top-level project in 2009. Preflight was originally named PaDaF, was developed by Atos Worldline, and was donated to the project in 2011. In February 2015, Apache PDFBox was named an Open Source Partner Organization of the PDF Association ("Apache™ PDFBox™ named an Open Source Partner Organization of the PDF Association", February 3, 2015).


See also

* List of PDF software


External links


Apache PDFBox Project