Bytescout.PDFExtractor 5.20.0.1871

dotnet add package Bytescout.PDFExtractor --version 5.20.0.1871
NuGet\Install-Package Bytescout.PDFExtractor -Version 5.20.0.1871
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.
<PackageReference Include="Bytescout.PDFExtractor" Version="5.20.0.1871" />
For projects that support PackageReference, copy this XML node into the project file to reference the package.
paket add Bytescout.PDFExtractor --version 5.20.0.1871
#r "nuget: Bytescout.PDFExtractor, 5.20.0.1871"
The #r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or the source code of the script to reference the package.
// Install Bytescout.PDFExtractor as a Cake Addin
#addin nuget:?package=Bytescout.PDFExtractor&version=5.20.0.1871

// Install Bytescout.PDFExtractor as a Cake Tool
#tool nuget:?package=Bytescout.PDFExtractor&version=5.20.0.1871

Bytescout PDF Extractor SDK for .NET 2.00-4.50, ASP.NET, ActiveX

(c) ByteScout 2008-2015

System Requirements: .NET framework installed
Works with: .NET, ASP.NET, ActiveX, Visual Basic 6, Delphi, Classic ASP

Benefits:

- Extracts tables in PDF files as CSV, XML, XLS, XLSX;
- Extracts embedded files and attachments from PDF;
- Splits and merges PDF documents;
- Extracts text from PDF (from whole page or given rectangle);
- Extracts embedded images from PDF documents;
- Extracts document information from PDF (author, subject, producer, etc.);
- Detects tables in PDF files;
- Searches for text;
- Extracts from given coordinates;
- Extracts FDF data, supports text restoration using OCR;
- and much more!

http://bytescout.com/

Compatible and additional computed target framework versions

Product: .NET Framework
Compatible: net11, net20, net35, net35-client, net40, net40-client, net45
Computed: net403, net451, net452, net46, net461, net462, net463, net47, net471, net472, net48, net481

This package has no dependencies.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version Downloads Last updated
5.20.0.1871 569 2/5/2015
5.0.0.1626 314 8/14/2014
4.0.0.1487 113 5/31/2014
3.40.0.1349 84 3/11/2014
3.20.0.1092 64 8/5/2013
3.20.0.1075 127 7/12/2013
3.10.0.1051 59 6/29/2013
3.0.0.839 69 3/26/2013
2.50.0.769 60 2/25/2013

Bytescout PDF Extractor SDK for .NET, ASP.NET and ActiveX
------------------

5.20.1781 (January 27, 2015)
PDF to XML, PDF to CSV, PDF to Text functionality improved
OCRMode now provides 9 modes
.DetectLineInsteadOfParagraph now works much better. Set it to False to capture multiline text in table cells!
PDF controls support improved
FDF and XFDF data extraction added
Table detection improved to support multiline text in cells and tables with absent rows
beta version of PDF/A validator added
minor fixes and improvements


5.10.1747 (November 25, 2014)
PDF to XML, PDF to CSV, PDF to Text functions improved
now supports text extraction from text controls
XML extractor now adds font style, size, name, text coordinates into <text> tags
ASP.NET sample for OCR usage added
new property OCRLanguageDataFolder to specify the location of "tessdata" folder
improved support of PDF files
improved support for rotated text
updated source code samples
updated documentation
minor improvements and fixes

5.00.1626 (August 14, 2014)
OCR (text from images) functionality added: now you may extract text from embedded images and repair damaged text
issue fixed with CSV and XML extractor missing last columns with some settings
improved support for damaged PDF files
multiline text search with word matching modes is now supported
text search now finds hyphenated text and text split across lines: see the new source code sample "Find Text With Hyphens"
new property .RTLTextAutoDetectionEnabled (false by default) to auto detect RTL languages
PDF Viewer GUI demo improved
minor improvements and fixes
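
The hyphen-aware search mentioned in this release can be illustrated with a small Python sketch. It is not the SDK's implementation or API: the helper names dehyphenate and find_across_lines are hypothetical, and the approach (joining line-end hyphenation and collapsing whitespace before matching) is only one plausible way to get this behavior.

```python
import re


def dehyphenate(text: str) -> str:
    """Join words that were split with a hyphen at the end of a line."""
    # "extrac-\ntion" becomes "extraction"; hyphens not followed by a
    # line break are left alone.
    return re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)


def find_across_lines(text: str, needle: str) -> bool:
    """Search for needle, ignoring line-end hyphenation and line breaks."""
    normalized = re.sub(r"\s+", " ", dehyphenate(text))
    return needle in normalized


page_text = "The SDK supports text extrac-\ntion from PDF\ndocuments."
print(find_across_lines(page_text, "text extraction"))     # True
print(find_across_lines(page_text, "from PDF documents"))  # True
```

One limitation of this sketch: a genuinely hyphenated compound that happens to break at the hyphen (for example "well-" / "known") is also joined, so a real search would need to try both forms.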

4.00.1487 (May 30, 2014)
improved pdf to text, pdf to csv, pdf to xml
issue with extraction area fixed
Improved Unicode handling
new .ContentType to check if PDF is PDF, Portfolio or XFAForm
new properties: Unwrap, ExtractionAreaUsageMode
new AttachmentInfo class to obtain details about attachment
new XFA Form XML extraction support (see XFAFormExtractor and XFAFormToXML samples)
ZUGFeRD PDF support added
Multithreading performance improved
Licensing updated: Now Licensing is per developer
new "match whole word" parameter to TextExtractor.Find()
improved XLS and XLSX output

3.40.1349 (March 10, 2014)
improved stability of the text extraction
issue with the very last text line missing in some PDF files fixed
tables with empty cells are handled better now
issue with incorrect extraction of overlapped text objects fixed
issue with missing spaces between words in some files fixed
issue with incorrect X coordinate returned while searching with extraction area defined fixed
minor bug-fixes and improvements

3.30.1240 (November 27, 2013)
improved support for PDF files in old formats
image flipping issue in some PDF files fixed
improved text rendering in PDF files
minor bug-fixes

3.20.1209 (October 31, 2013)
table detection was not returning proper coordinates for 2nd and further tables, fixed
minor source code samples updates
DocumentSplitter now works with multipage TIF files
minor bug-fixes

3.20.1200 (October 28, 2013)
minor rotated text issues fixed
table detection was not returning proper coordinates, fixed
minor bug-fixes

3.20.1179 (October 22, 2013)
pdf to text and pdf data extraction improved
new .AutoAlignColumnsToHeader (true by default) property to automatically align cells to the header column or not (switching this setting will help if you are getting some shifted cells)
new DocumentRotator class to rotate pages in PDF documents
new ExtractRawImages property in Images Extractor to define if we are extracting raw images or images with rotation and transformation applied
improved support of PDF files with rotated objects and pages
new source code sample showing how to extract page found by a keyword "Find Keyword And Extract Page"
Images Extractor: SetExtractionArea() method added to define a rectangle area to extract images from
improved Splitting Pages example
improved pages extraction from PDF
new RemoveUnusedResources method to remove unused resources from PDF to reduce file size
minor bug-fixes and improvements

3.20.1100 (August 22, 2013)
new method: DocumentSplitter.Split(sourcefile, splitPages) to extract multiple ranges of pages from the same PDF file
minor bug-fixes in pdf to text engine
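
As an illustration of extracting several page ranges in one call, the Python sketch below parses a range specification into inclusive (first, last) pairs. The "1-3,5,7-9" syntax and the function name parse_page_ranges are assumptions made for this example; the release notes do not state what format the splitPages argument actually takes.

```python
def parse_page_ranges(spec: str) -> list[tuple[int, int]]:
    """Parse a hypothetical spec such as "1-3,5,7-9" into inclusive page ranges."""
    ranges = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        first, sep, last = part.partition("-")
        start = int(first)
        end = int(last) if sep else start  # a single page is a one-page range
        if start < 1 or end < start:
            raise ValueError(f"invalid page range: {part!r}")
        ranges.append((start, end))
    return ranges


print(parse_page_ranges("1-3, 5, 7-9"))  # [(1, 3), (5, 5), (7, 9)]
```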

3.20.1093 (August 5, 2013)
pdf to text minor functionality fixes
x64 installer improvements
minor fixes for error messages
PDFDocument.Dispose() no longer disposes the source PDF stream if that stream was supplied by the user (the user should dispose of it)
improved PDF format support
minor bug-fixes
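
The stream ownership rule described for PDFDocument.Dispose() in this release is a common pattern: an object closes only the resources it opened itself. The Python sketch below shows the idea with a toy Document class; the class and its attributes are hypothetical and do not mirror the SDK's types.

```python
import io


class Document:
    """Toy wrapper that closes only the streams it opened itself."""

    def __init__(self, source):
        if isinstance(source, str):
            # A path was given: this object opens the file, so it owns the stream.
            self._stream = open(source, "rb")
            self._owns_stream = True
        else:
            # A stream was supplied by the caller: the caller keeps ownership.
            self._stream = source
            self._owns_stream = False

    def close(self):
        if self._owns_stream:
            self._stream.close()

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()


buffer = io.BytesIO(b"%PDF-1.4 placeholder bytes")
with Document(buffer):
    pass
print(buffer.closed)  # False: the caller still has to close its own stream
buffer.close()
```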

3.20.1075 (July 11, 2013)
improved PDF To CSV, PDF To XLS, PDF To XML extraction
improved PDF reading speed and stability
minor bug-fixes


3.10.1051 (June 29, 2013)
improved table extraction support
improved pdf files support

3.10.1038 (June 26, 2013)
improved text extraction support
issues fixed related to incorrect extraction area coordinates for some PDF files with scanned images
speed improvements
improved support for various PDF files

3.10.942 (May 30, 2013)
improved pdf text extraction support
minor bug-fixes and improvements

3.10.899 (May 14, 2013)
improved pdf to text conversion
improved PDF reading support
more source Visual Basic .NET, C# and VBScript code samples added
documentation updated

3.00.864 (April 11, 2013)
improved PDF extraction support
improved PDF handling
pdf splitting and merging: new property DocumentSplitter.OptimizeSplittedDocuments to optimize PDF files after splitting (may decrease file size when needed)
improved PDF fonts handling
demo utility updated
source code samples updated to run on any .NET framework by default
minor bug-fixes


3.00.825 (March 12, 2013)
improved pdf to text, pdf to csv
demo utility PDF Viewer reworked and updated for better UI experience
minor improvements and fixes in PDF support
improved PDF stability while working with PDF files with high density vector graphics inside
improved support for indexed color palettes
improved embedded fonts rendering
better support for Unicode fonts
new .Version property to read exact version of the dll
minor updates and improvements


2.50.708 (November 11, 2012)
PDF data extraction speed improved
Windows 8 support improved
PDF images and colors support improved
PDF to csv, PDF to xml, PDF to xls/xlsx now skip leading rows if they are empty
pdf text search now works better and provides more intelligent support for regular expressions
ActiveX support and installation improved and now provides single batches to run on Windows x86/x64 for Windows XP to 8 Pro
new property: .ExtractShadowLikeText to enable/disable extraction of shadowed text (where it is used as effect to create visual shadows)
minor bug-fixes and improvements


2.40.650 (November 1, 2012)
improved support for Unicode text extraction
improved support for PDF/A pdf files
issues with white stripes appearing on multiple images combined fixed
data extraction internal optimizations
improved support for 8 bit images inside PDF
vector drawings improved to provide better support for multiple small objects
Color representation in images with indexed colors fixed
Type2 fonts support improved
Improved support for embedded fonts in PDF produced by Ghostscript engine
CCITT image compression related issues fixed
LZW compressed PDF support improved
improved support for shading objects
improved PDF fonts support
improved support for PDF with 4 bit images


2.30.594 (September 18, 2012)
PDF data extraction improved
memory and speed optimizations
fixing issue with empty data while extracting data from some PDF files
improved images extraction support (more image encoding variations are supported)
minor updates in examples
minor bug-fixes

2.30.568 (June 21, 2012)
pdf to text conversion quality improved
multithreading usage stability has been improved
hanging issue on some PDF fixed
PDF Extractor SDK: updated sample for StructuredExtractor (previously known as TableExtractor interface)
minor fixes and improvements


2.20.0.539 (May 4, 2012)
improved stability
demo utility improved
important security fixes


2.20.525 (April 14, 2012)
improved speed (up to x2 faster on some documents)
Tables detection improved
updated PDF Viewer utility
improved support for structured text extraction (CSV and XML data extraction)
minor bug-fixes

2.20.458 (February 2, 2012)
minor fixes in TableDetector class (.TableDetectionMinNumberOfColumns and .TableDetectionMinNumberOfRows were working incorrectly)
improved text extraction for PDF files generated from text files
improved support for PDF files produced by Adobe Acrobat
PDF Viewer: CSV, XML and Text extractor forms updated to show .PreserveFormattingOnTextExtraction option
minor fixes in .NET 4.0 assemblies
Renderer SDK adds /Visual Basic/PDF To BMP using streams/ sample
improved support for PDF with forms objects
improved leading spaces format detection in text extraction
.SetExtractionArea() added to define area on a page to work with in PDF Renderer SDK
improved fonts information reading support in PDF files
new .PageSeparator property in TextExtractor allowing to define a separator string for pages if you need one
fixing issue with indexed colorspaces in PDF
improved PDF format support

2.20.415 (December 21, 2011)
PDF Extractor SDK: minor update for PDF to XLS sample
rendering: improved fonts support
text extraction with formatting improved
new source code sample to show how to save extracted text to a stream
performance optimized and pdf processing speed improved
improved support for PDF format

2.20.396 (November 30, 2011)
fixing issues with CSV, XML and XLS extraction on long tables
PDF Viewer now provides ability to turn on/off text formatting support on extraction
PDF support improved
minor bug-fixes

2.20.392 (November 25, 2011)
NEW table detection implemented, see new Bytescout.PDFExtractor.TableDetector interface and source code samples in /Find Table And Extract As CSV/ sub-folder in examples
NEW regular expressions support for text search in TextExtractor (see .RegexSearch property)
Text search functionality improved
minor bug-fixes

2.10.303 (October 4, 2011)
NEW: DocumentMerger and DocumentSplitter interfaces and classes to merge and split PDF documents
improved support for PDF documents
PDF processing speed increased
minor bug-fixes

2.10.276 (August 26, 2011)
NEW: AttachmentExtractor interface to extract file attachments and embedded files from PDF (see /Examples/Extract Attachments/ for sample source code)
NEW: XLSExtractor interface to extract tables from PDF as XLS and XLSX Excel files (including font formatting)
improved text extraction functionality
improved output image quality
improved support of Unicode text
improved support of damaged PDF files (not hanging on damaged files anymore)

2.00.228 (12 July 2011)
CSVExtractor: SeparationSymbol and QuotationSymbol properties were added
TrimValues property for CSVExtractor and XMLExtractor: turned on by default to trim detected cell values automatically
Default properties for CSV extraction improved
default space ratio in text extractor corrected to 0.4; the previous value of 1.2 caused some words to be joined into a single one
TextExtractor.detectNewColumnBySpacesRatio renamed to .SpaceRatioBetweenWords property
PDFViewer now shows options dialog to adjust SpaceRatioBetweenWords if needed
minor bug-fixes
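
To see why a space ratio that is too high merges words, consider the simplified Python sketch below. It assumes the ratio is compared against the gap between neighbouring glyphs relative to the average glyph width; that interpretation, the join_glyphs function and the glyph tuples are all assumptions for illustration, not the SDK's actual algorithm.

```python
def join_glyphs(glyphs, space_ratio=0.4):
    """Rebuild a text line from (char, x_start, width) tuples sorted left to right.

    A space is inserted when the gap between two neighbouring glyphs
    exceeds space_ratio times the average glyph width.
    """
    if not glyphs:
        return ""
    avg_width = sum(width for _, _, width in glyphs) / len(glyphs)
    out = [glyphs[0][0]]
    for (_, prev_x, prev_w), (char, x, _) in zip(glyphs, glyphs[1:]):
        gap = x - (prev_x + prev_w)
        if gap > space_ratio * avg_width:
            out.append(" ")
        out.append(char)
    return "".join(out)


# "to" and "be" are separated by a gap of 6 units; every glyph is 10 units wide.
line = [("t", 0, 10), ("o", 10, 10), ("b", 26, 10), ("e", 36, 10)]
print(join_glyphs(line, space_ratio=0.4))  # to be
print(join_glyphs(line, space_ratio=1.2))  # tobe
```

With a ratio of 0.4 the 6-unit gap exceeds the 4-unit threshold and a space is emitted; with 1.2 the threshold rises to 12 units and the two words run together.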

2.00.217 (21 June 2011)
CSV and XML extraction speed greatly improved
CSVExtractor and XMLExtractor classes add new .DetectNewColumnBySpacesRatio property: use this property to control space between detected columns of text
XML and CSV Extractor adds .SkipCellsWithEmptyValues property (true by default to skip cells with empty values)
PDF Viewer now shows extraction options dialog for XML and CSV export functions
PDF To CSV to XLS source code sample added
PDF To CSV\Delphi\ source code sample added
minor bug-fixes and improvements

2.00.206 (6 June 2011)
support for .NET 3.5, .NET 4.00 added
Delphi source code sample has been added
minor bug-fixes and improvements

2.00.186 (May 16, 2011)
pdf processing speed increased up to 10 times
minor bug-fixes and improvements

1.10.168 (May 6 2011)
support for password protected PDF documents improved (was not working properly in previous release)
minor bug-fixes and improvements

1.10.160 (12 April 2011)

XML comments are available now to show hints for methods, classes and properties in Visual Studio
New property: .ExtractColumnByColumn (false default), set to True to extract text column by column instead of line by line
PDF Viewer freeware utility updated to feature "Extract Text (line by line)" and "Extract Text (column by column)" buttons
improved support for single paged PDF documents produced by Acrobat Distiller software
clipping issues were fixed
fixed hanging on some broken PDF documents
improved text decoding support
minor bug-fixes


1.10.150 (10 March 2011)
* PDF files support improved
+ now handles PDF files from Google Doc without errors
* minor bug-fixes

1.10.144 (26 February 2011)
+ now works with secured documents (provide password if needed in .Password property)
+ minor bug-fixes and improvements
+ updated GUI demo application

1.10.121 (11 February 2011)
+ PDF to CSV extractor added
+ PDF to XML extractor added
+ support for invisible text extraction added
+ minor bug-fixes and improvements


1.00.30 (9 November 2010)
+ new version