Apache PDFBox

From Wikipedia, the free encyclopedia
PDFBox
Developer: Apache Software Foundation
Stable release:
    1.8.x: 1.8.17 / 15 September 2022[1]
    2.0.x: 2.0.32 / 24 July 2024[1]
    3.0.x: 3.0.3 / 8 August 2024[1]
Repository: PDFBox Repository (Mirror)
Written in: Java
Operating system: Cross-platform
Type: Portable Document Format (PDF)
License: Apache License 2.0
Website: pdfbox.apache.org

    Apache PDFBox is an open-source, pure-Java library for working with PDF files. It can create, render, print, split, merge, alter, and verify PDF documents, and extract text and metadata from them.

    Open Hub reports over 11,000 commits (counted from the start as an Apache project) by 18 contributors, representing more than 140,000 lines of code. PDFBox has a well-established, mature codebase maintained by an average-sized development team with increasing year-over-year commits. Under the COCOMO model, the codebase represents an estimated 46 person-years of effort.[2]
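As a rough illustration of how a COCOMO effort figure relates to a line count, the sketch below applies the Basic COCOMO formula, effort = a × KLOC^b person-months, with the classic "organic" coefficients a = 2.4 and b = 1.05. The class name and the choice of coefficients are assumptions made for this example; the article does not state which parameters or exact line count Open Hub used, so the result is not expected to reproduce the quoted figure exactly.

```java
public class CocomoSketch {
    // Basic COCOMO "organic" mode coefficients (assumed here, not taken from the article).
    static final double A = 2.4;
    static final double B = 1.05;

    // Estimated effort in person-months for a codebase of the given size in KLOC.
    static double personMonths(double kloc) {
        return A * Math.pow(kloc, B);
    }

    // Same estimate expressed in person-years (12 person-months per person-year).
    static double personYears(double kloc) {
        return personMonths(kloc) / 12.0;
    }

    // Inverse: the size in KLOC that would yield the given effort in person-years.
    static double klocForPersonYears(double years) {
        return Math.pow(years * 12.0 / A, 1.0 / B);
    }

    public static void main(String[] args) {
        double kloc = 140.0; // lower bound only: the article says "more than 140,000 lines"
        System.out.printf("%.0f KLOC -> %.1f person-years%n", kloc, personYears(kloc));
        System.out.printf("46 person-years -> %.0f KLOC%n", klocForPersonYears(46.0));
    }
}
```

With exactly 140 KLOC these coefficients give a figure somewhat below 46 person-years, which is consistent with the article's wording that the codebase is "more than" 140,000 lines.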

    Structure

    Apache PDFBox has these components:

    • PDFBox: the main part
    • FontBox: handles font information
    • XmpBox: handles XMP metadata
    • Preflight (optional): checks PDF files for PDF/A-1b conformity.

    History

    PDFBox was started in 2002 on SourceForge by Ben Litchfield, who wanted to extract text from PDF files for Lucene.[3] It became an Apache Incubator project in 2008 and an Apache top-level project in 2009.[4]

    Preflight, originally named PaDaF, was developed by Atos Worldline and donated to the project in 2011.[5]

    In February 2015, Apache PDFBox was named an Open Source Partner Organization of the PDF Association.[6]

    References

    1. ^ a b c [citation text not recoverable]
    2. ^ [citation text not recoverable]
    3. ^ Apache PDFBox and FontBox 1.0.0 released, The H Open, 16 February 2010
    4. ^ PDFBox Project Incubation Status
    5. ^ PaDaF Preflight Codebase Intellectual Property (IP) Clearance Status
    6. ^ Apache™ PDFBox™ named an Open Source Partner Organization of the PDF Association, February 3, 2015