diff --git a/.gitignore b/.gitignore index 19a1d30..a72f3dd 100644 --- a/.gitignore +++ b/.gitignore @@ -1,5 +1,454 @@ -bin -obj -*.user -*.suo -*.nupkg +## Ignore Visual Studio temporary files, build results, and +## files generated by popular Visual Studio add-ons. +## +## Get latest from https://github.com/github/gitignore/blob/master/VisualStudio.gitignore + +# User-specific files +*.rsuser +*.suo +*.user +*.userosscache +*.sln.docstates + +# User-specific files (MonoDevelop/Xamarin Studio) +*.userprefs + +# Mono auto generated files +mono_crash.* + +# Build results +[Dd]ebug/ +[Dd]ebugPublic/ +[Rr]elease/ +[Rr]eleases/ +x64/ +x86/ +[Ww][Ii][Nn]32/ +[Aa][Rr][Mm]/ +[Aa][Rr][Mm]64/ +bld/ +[Bb]in/ +[Oo]bj/ +[Ll]og/ +[Ll]ogs/ + +# Visual Studio 2015/2017 cache/options directory +.vs/ +# Uncomment if you have tasks that create the project's static files in wwwroot +#wwwroot/ + +# Visual Studio 2017 auto generated files +Generated\ Files/ + +# MSTest test Results +[Tt]est[Rr]esult*/ +[Bb]uild[Ll]og.* + +# NUnit +*.VisualState.xml +TestResult.xml +nunit-*.xml + +# Build Results of an ATL Project +[Dd]ebugPS/ +[Rr]eleasePS/ +dlldata.c + +# Benchmark Results +BenchmarkDotNet.Artifacts/ + +# .NET +project.lock.json +project.fragment.lock.json +artifacts/ + +# Tye +.tye/ + +# ASP.NET Scaffolding +ScaffoldingReadMe.txt + +# StyleCop +StyleCopReport.xml + +# Files built by Visual Studio +*_i.c +*_p.c +*_h.h +*.ilk +*.meta +*.obj +*.iobj +*.pch +*.pdb +*.ipdb +*.pgc +*.pgd +*.rsp +*.sbr +*.tlb +*.tli +*.tlh +*.tmp +*.tmp_proj +*_wpftmp.csproj +*.log +*.vspscc +*.vssscc +.builds +*.pidb +*.svclog +*.scc + +# Chutzpah Test files +_Chutzpah* + +# Visual C++ cache files +ipch/ +*.aps +*.ncb +*.opendb +*.opensdf +*.sdf +*.cachefile +*.VC.db +*.VC.VC.opendb + +# Visual Studio profiler +*.psess +*.vsp +*.vspx +*.sap + +# Visual Studio Trace Files +*.e2e + +# TFS 2012 Local Workspace +$tf/ + +# Guidance Automation Toolkit +*.gpState + +# ReSharper is a .NET coding add-in 
+_ReSharper*/ +*.[Rr]e[Ss]harper +*.DotSettings.user + +# TeamCity is a build add-in +_TeamCity* + +# DotCover is a Code Coverage Tool +*.dotCover + +# AxoCover is a Code Coverage Tool +.axoCover/* +!.axoCover/settings.json + +# Coverlet is a free, cross platform Code Coverage Tool +coverage*.json +coverage*.xml +coverage*.info + +# Visual Studio code coverage results +*.coverage +*.coveragexml + +# NCrunch +_NCrunch_* +.*crunch*.local.xml +nCrunchTemp_* + +# MightyMoose +*.mm.* +AutoTest.Net/ + +# Web workbench (sass) +.sass-cache/ + +# Installshield output folder +[Ee]xpress/ + +# DocProject is a documentation generator add-in +DocProject/buildhelp/ +DocProject/Help/*.HxT +DocProject/Help/*.HxC +DocProject/Help/*.hhc +DocProject/Help/*.hhk +DocProject/Help/*.hhp +DocProject/Help/Html2 +DocProject/Help/html + +# Click-Once directory +publish/ + +# Publish Web Output +*.[Pp]ublish.xml +*.azurePubxml +# Note: Comment the next line if you want to checkin your web deploy settings, +# but database connection strings (with potential passwords) will be unencrypted +*.pubxml +*.publishproj + +# Microsoft Azure Web App publish settings. Comment the next line if you want to +# checkin your Azure Web App publish settings, but sensitive information contained +# in these scripts will be unencrypted +PublishScripts/ + +# NuGet Packages +*.nupkg +# NuGet Symbol Packages +*.snupkg +# The packages folder can be ignored because of Package Restore +**/[Pp]ackages/* +# except build/, which is used as an MSBuild target. 
+!**/[Pp]ackages/build/ +# Uncomment if necessary however generally it will be regenerated when needed +#!**/[Pp]ackages/repositories.config +# NuGet v3's project.json files produces more ignorable files +*.nuget.props +*.nuget.targets + +# Microsoft Azure Build Output +csx/ +*.build.csdef + +# Microsoft Azure Emulator +ecf/ +rcf/ + +# Windows Store app package directories and files +AppPackages/ +BundleArtifacts/ +Package.StoreAssociation.xml +_pkginfo.txt +*.appx +*.appxbundle +*.appxupload + +# Visual Studio cache files +# files ending in .cache can be ignored +*.[Cc]ache +# but keep track of directories ending in .cache +!?*.[Cc]ache/ + +# Others +ClientBin/ +~$* +*~ +*.dbmdl +*.dbproj.schemaview +*.jfm +*.pfx +*.publishsettings +orleans.codegen.cs + +# Including strong name files can present a security risk +# (https://github.com/github/gitignore/pull/2483#issue-259490424) +#*.snk + +# Since there are multiple workflows, uncomment next line to ignore bower_components +# (https://github.com/github/gitignore/pull/1529#issuecomment-104372622) +#bower_components/ + +# RIA/Silverlight projects +Generated_Code/ + +# Backup & report files from converting an old project file +# to a newer Visual Studio version. Backup files are not needed, +# because we have git ;-) +_UpgradeReport_Files/ +Backup*/ +UpgradeLog*.XML +UpgradeLog*.htm +ServiceFabricBackup/ +*.rptproj.bak + +# SQL Server files +*.mdf +*.ldf +*.ndf + +# Business Intelligence projects +*.rdl.data +*.bim.layout +*.bim_*.settings +*.rptproj.rsuser +*- [Bb]ackup.rdl +*- [Bb]ackup ([0-9]).rdl +*- [Bb]ackup ([0-9][0-9]).rdl + +# Microsoft Fakes +FakesAssemblies/ + +# GhostDoc plugin setting file +*.GhostDoc.xml + +# Node.js Tools for Visual Studio +.ntvs_analysis.dat +node_modules/ + +# Visual Studio 6 build log +*.plg + +# Visual Studio 6 workspace options file +*.opt + +# Visual Studio 6 auto-generated workspace file (contains which files were open etc.) 
+*.vbw + +# Visual Studio LightSwitch build output +**/*.HTMLClient/GeneratedArtifacts +**/*.DesktopClient/GeneratedArtifacts +**/*.DesktopClient/ModelManifest.xml +**/*.Server/GeneratedArtifacts +**/*.Server/ModelManifest.xml +_Pvt_Extensions + +# Paket dependency manager +.paket/paket.exe +paket-files/ + +# FAKE - F# Make +.fake/ + +# CodeRush personal settings +.cr/personal + +# Python Tools for Visual Studio (PTVS) +__pycache__/ +*.pyc + +# Cake - Uncomment if you are using it +# tools/** +# !tools/packages.config + +# Tabs Studio +*.tss + +# Telerik's JustMock configuration file +*.jmconfig + +# BizTalk build output +*.btp.cs +*.btm.cs +*.odx.cs +*.xsd.cs + +# OpenCover UI analysis results +OpenCover/ + +# Azure Stream Analytics local run output +ASALocalRun/ + +# MSBuild Binary and Structured Log +*.binlog + +# NVidia Nsight GPU debugger configuration file +*.nvuser + +# MFractors (Xamarin productivity tool) working folder +.mfractor/ + +# Local History for Visual Studio +.localhistory/ + +# BeatPulse healthcheck temp database +healthchecksdb + +# Backup folder for Package Reference Convert tool in Visual Studio 2017 +MigrationBackup/ + +# Ionide (cross platform F# VS Code tools) working folder +.ionide/ + +# Fody - auto-generated XML schema +FodyWeavers.xsd + +## +## Visual studio for Mac +## + + +# globs +Makefile.in +*.userprefs +*.usertasks +config.make +config.status +aclocal.m4 +install-sh +autom4te.cache/ +*.tar.gz +tarballs/ +test-results/ + +# Mac bundle stuff +*.dmg +*.app + +# content below from: https://github.com/github/gitignore/blob/master/Global/macOS.gitignore +# General +.DS_Store +.AppleDouble +.LSOverride + +# Icon must end with two \r +Icon + + +# Thumbnails +._* + +# Files that might appear in the root of a volume +.DocumentRevisions-V100 +.fseventsd +.Spotlight-V100 +.TemporaryItems +.Trashes +.VolumeIcon.icns +.com.apple.timemachine.donotpresent + +# Directories potentially created on remote AFP share +.AppleDB +.AppleDesktop +Network 
Trash Folder +Temporary Items +.apdisk + +# content below from: https://github.com/github/gitignore/blob/master/Global/Windows.gitignore +# Windows thumbnail cache files +Thumbs.db +ehthumbs.db +ehthumbs_vista.db + +# Dump file +*.stackdump + +# Folder config file +[Dd]esktop.ini + +# Recycle Bin used on file shares +$RECYCLE.BIN/ + +# Windows Installer files +*.cab +*.msi +*.msix +*.msm +*.msp + +# Windows shortcuts +*.lnk + +# JetBrains Rider +.idea/ +*.sln.iml + +## +## Visual Studio Code +## +.vscode/* +!.vscode/settings.json +!.vscode/tasks.json +!.vscode/launch.json +!.vscode/extensions.json diff --git a/PDFSharp.Extensions.Sample/PDFSharp.Extensions.Sample.csproj b/PDFSharp.Extensions.Sample/PDFSharp.Extensions.Sample.csproj new file mode 100644 index 0000000..8b8f356 --- /dev/null +++ b/PDFSharp.Extensions.Sample/PDFSharp.Extensions.Sample.csproj @@ -0,0 +1,24 @@ + + + + Exe + net6.0 + + + + + + + + + Always + + + + + + Always + + + + diff --git a/PDFSharp.Extensions.Sample/Program.cs b/PDFSharp.Extensions.Sample/Program.cs new file mode 100644 index 0000000..9d171de --- /dev/null +++ b/PDFSharp.Extensions.Sample/Program.cs @@ -0,0 +1,103 @@ +using System; +using SixLabors.ImageSharp; +using System.IO; +using System.Linq; +using PdfSharp.Pdf; +using PdfSharp.Pdf.Drawing; +using PdfSharpCore.Pdf; +using PdfSharpCore.Pdf.IO; + +namespace PDFSharp.Extensions.Sample +{ + internal static class Program + { + private static void Main(string[] args) + { + var root = args.FirstOrDefault() ?? 
Path.Combine("res"); + var output = Directory.CreateDirectory("out").Name; + const SearchOption o = SearchOption.AllDirectories; + var files = Directory.GetFiles(root, "*.pdf", o); + foreach (var file in files) + try + { + ExtractImages(output, file); + } + catch (Exception e) + { + Console.Error.WriteLine($"{file} -> {e.Message}"); + } + + var doc = files.FirstOrDefault(f => f.Contains("sample")); + ExtractDocument(output, doc); + + var imgs = Directory.GetFiles(root, "*.png", o); + CombineImages(output, imgs); + } + + private static void CombineImages(string output, string[] filenames) + { + var images = filenames.Select(Image.Load).ToArray(); + + var name = Path.GetFileNameWithoutExtension(filenames.First()); + var path = Path.Combine(output, $"{name}.pdf"); + using (PdfDocument pdf = images.First().ToPdf()) + pdf.Save(path); + + name = Path.GetFileNameWithoutExtension(filenames.Last()); + path = Path.Combine(output, $"{name}.pdf"); + using (PdfDocument pdf = images.ToPdf()) + pdf.Save(path); + } + + private static void ExtractDocument(string output, string filename) + { + using (var document = PdfReader.Open(filename, PdfDocumentOpenMode.Import)) + { + using var image = document.GetImages().Single(); + var name = Path.GetFileNameWithoutExtension(filename); + var path = Path.Combine(output, $"{name}.png"); + image.SaveAsPng(path); + + var page = document.Pages[0]; + + var elements = page.Elements; + var array = elements.Values.OfType<PdfArray>().Single(); + Console.WriteLine($" {nameof(PdfArray)} " + + $"{nameof(PdfArrayExtensions.IsEmpty)} " + + $"= {array.IsEmpty()}"); + array.Dump(); + + var resources = page.Resources.Elements; + var dict = resources.Values.OfType<PdfDictionary>().First(); + dict.Dump(); + } + } + + private static void ExtractImages(string output, string filename) + { + Console.WriteLine("Processing file: {0}", filename); + + using (var document = PdfReader.Open(filename, PdfDocumentOpenMode.Import)) + { + var pageIndex = 0; + foreach (PdfPage page in document.Pages) 
+ { + var imageIndex = 0; + foreach (var image in page.GetImages()) + { + var currPage = pageIndex + 1; + var currImg = imageIndex + 1; + Console.WriteLine("\r\nExtracting image {1} from page {0}", currPage, currImg); + + var pre = Path.GetFileNameWithoutExtension(filename); + var path = string.Format(@"{2} {0:00000000}-{1:000}.png", currPage, currImg, pre); + path = Path.Combine(output, path); + image.SaveAsPng(path); + imageIndex++; + } + pageIndex++; + } + } + } + } +} \ No newline at end of file diff --git a/PDFSharp.Extensions.Sample/res/CCITTFax/ICC1v42_2006-05.pdf b/PDFSharp.Extensions.Sample/res/CCITTFax/ICC1v42_2006-05.pdf new file mode 100755 index 0000000..8ecd126 Binary files /dev/null and b/PDFSharp.Extensions.Sample/res/CCITTFax/ICC1v42_2006-05.pdf differ diff --git a/PDFSharp.Extensions.Sample/res/CCITTFax/PDFlib-tutorial.pdf b/PDFSharp.Extensions.Sample/res/CCITTFax/PDFlib-tutorial.pdf new file mode 100755 index 0000000..20d35b9 Binary files /dev/null and b/PDFSharp.Extensions.Sample/res/CCITTFax/PDFlib-tutorial.pdf differ diff --git a/PDFSharp.Extensions.Sample/res/DCT/Created.pdf b/PDFSharp.Extensions.Sample/res/DCT/Created.pdf new file mode 100644 index 0000000..f91d7d2 Binary files /dev/null and b/PDFSharp.Extensions.Sample/res/DCT/Created.pdf differ diff --git a/PDFSharp.Extensions.Sample/res/DCT/Sample.pdf b/PDFSharp.Extensions.Sample/res/DCT/Sample.pdf new file mode 100644 index 0000000..1c372b2 Binary files /dev/null and b/PDFSharp.Extensions.Sample/res/DCT/Sample.pdf differ diff --git a/PDFSharp.Extensions.Sample/res/DCT/issue_28.pdf b/PDFSharp.Extensions.Sample/res/DCT/issue_28.pdf new file mode 100755 index 0000000..87f133b Binary files /dev/null and b/PDFSharp.Extensions.Sample/res/DCT/issue_28.pdf differ diff --git a/PDFSharp.Extensions.Sample/res/Flate/OTGuide.pdf b/PDFSharp.Extensions.Sample/res/Flate/OTGuide.pdf new file mode 100755 index 0000000..bf0dceb Binary files /dev/null and b/PDFSharp.Extensions.Sample/res/Flate/OTGuide.pdf 
differ diff --git a/PDFSharp.Extensions.Sample/res/Flate/pdfbox_webpage.pdf b/PDFSharp.Extensions.Sample/res/Flate/pdfbox_webpage.pdf new file mode 100644 index 0000000..c0d1385 Binary files /dev/null and b/PDFSharp.Extensions.Sample/res/Flate/pdfbox_webpage.pdf differ diff --git a/PDFSharp.Extensions.Sample/res/Flate/sample_fonts_solidconvertor.pdf b/PDFSharp.Extensions.Sample/res/Flate/sample_fonts_solidconvertor.pdf new file mode 100644 index 0000000..b7d0977 Binary files /dev/null and b/PDFSharp.Extensions.Sample/res/Flate/sample_fonts_solidconvertor.pdf differ diff --git a/PDFSharp.Extensions.Sample/res/New/example-1.png b/PDFSharp.Extensions.Sample/res/New/example-1.png new file mode 100644 index 0000000..8614a4e Binary files /dev/null and b/PDFSharp.Extensions.Sample/res/New/example-1.png differ diff --git a/PDFSharp.Extensions.Sample/res/New/example-2.png b/PDFSharp.Extensions.Sample/res/New/example-2.png new file mode 100644 index 0000000..7fd7f41 Binary files /dev/null and b/PDFSharp.Extensions.Sample/res/New/example-2.png differ diff --git a/PDFSharp.Extensions.csproj b/PDFSharp.Extensions.csproj deleted file mode 100644 index 635a1c0..0000000 --- a/PDFSharp.Extensions.csproj +++ /dev/null @@ -1,66 +0,0 @@ - - - - Debug - AnyCPU - 8.0.30703 - 2.0 - {35910C2C-3D6D-4912-9584-4DE66C294729} - Library - Properties - PDFSharp.Extensions - PDFSharp.Extensions - v4.0 - 512 - - - true - full - false - bin\Debug\ - DEBUG;TRACE - prompt - 4 - - - pdbonly - true - bin\Release\ - TRACE - prompt - 4 - - - - packages\BitMiracle.LibTiff.NET.2.3.642.0\lib\net20\BitMiracle.LibTiff.NET.dll - - - packages\PDFsharp.1.32.3057.0\lib\net20\PdfSharp.dll - - - - - - - - - - - - - - - - - - - - - - \ No newline at end of file diff --git a/PDFSharp.Extensions.nuspec b/PDFSharp.Extensions.nuspec deleted file mode 100644 index 588c7a6..0000000 --- a/PDFSharp.Extensions.nuspec +++ /dev/null @@ -1,23 +0,0 @@ - - - - PDFSharp.Extensions - 0.1.2.2 - PDFSharp Extension Methods - George 
Heeres - https://github.com/gheeres/PDFSharp.Extensions/blob/master/LICENSE - https://github.com/gheeres/PDFSharp.Extensions - false - Extension methods for PDFSharp to support and simplify some common operations including image extraction. - Initial release. - Copyright 2014 - pdf pdfsharp image extension - - - - - - - - - \ No newline at end of file diff --git a/PDFSharp.Extensions.sln b/PDFSharp.Extensions.sln index 0adc13a..7b97ba3 100644 --- a/PDFSharp.Extensions.sln +++ b/PDFSharp.Extensions.sln @@ -1,20 +1,28 @@ - -Microsoft Visual Studio Solution File, Format Version 12.00 -# Visual Studio 2012 -Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "PDFSharp.Extensions", "PDFSharp.Extensions.csproj", "{35910C2C-3D6D-4912-9584-4DE66C294729}" -EndProject -Global - GlobalSection(SolutionConfigurationPlatforms) = preSolution - Debug|Any CPU = Debug|Any CPU - Release|Any CPU = Release|Any CPU - EndGlobalSection - GlobalSection(ProjectConfigurationPlatforms) = postSolution - {35910C2C-3D6D-4912-9584-4DE66C294729}.Debug|Any CPU.ActiveCfg = Debug|Any CPU - {35910C2C-3D6D-4912-9584-4DE66C294729}.Debug|Any CPU.Build.0 = Debug|Any CPU - {35910C2C-3D6D-4912-9584-4DE66C294729}.Release|Any CPU.ActiveCfg = Release|Any CPU - {35910C2C-3D6D-4912-9584-4DE66C294729}.Release|Any CPU.Build.0 = Release|Any CPU - EndGlobalSection - GlobalSection(SolutionProperties) = preSolution - HideSolutionNode = FALSE - EndGlobalSection -EndGlobal + +Microsoft Visual Studio Solution File, Format Version 12.00 +# Visual Studio Version 17 +VisualStudioVersion = 17.0.31903.59 +MinimumVisualStudioVersion = 10.0.40219.1 +Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "PDFSharp.Extensions", "PDFSharp.Extensions\PDFSharp.Extensions.csproj", "{3D172E2D-785B-4CC1-BF95-CAD6AD9BDC41}" +EndProject +Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "PDFSharp.Extensions.Sample", "PDFSharp.Extensions.Sample\PDFSharp.Extensions.Sample.csproj", "{FB0B4F48-BE1D-4B26-BE00-A9A21F2820D5}" +EndProject +Global 
+ GlobalSection(SolutionConfigurationPlatforms) = preSolution + Debug|Any CPU = Debug|Any CPU + Release|Any CPU = Release|Any CPU + EndGlobalSection + GlobalSection(SolutionProperties) = preSolution + HideSolutionNode = FALSE + EndGlobalSection + GlobalSection(ProjectConfigurationPlatforms) = postSolution + {3D172E2D-785B-4CC1-BF95-CAD6AD9BDC41}.Debug|Any CPU.ActiveCfg = Debug|Any CPU + {3D172E2D-785B-4CC1-BF95-CAD6AD9BDC41}.Debug|Any CPU.Build.0 = Debug|Any CPU + {3D172E2D-785B-4CC1-BF95-CAD6AD9BDC41}.Release|Any CPU.ActiveCfg = Release|Any CPU + {3D172E2D-785B-4CC1-BF95-CAD6AD9BDC41}.Release|Any CPU.Build.0 = Release|Any CPU + {FB0B4F48-BE1D-4B26-BE00-A9A21F2820D5}.Debug|Any CPU.ActiveCfg = Debug|Any CPU + {FB0B4F48-BE1D-4B26-BE00-A9A21F2820D5}.Debug|Any CPU.Build.0 = Debug|Any CPU + {FB0B4F48-BE1D-4B26-BE00-A9A21F2820D5}.Release|Any CPU.ActiveCfg = Release|Any CPU + {FB0B4F48-BE1D-4B26-BE00-A9A21F2820D5}.Release|Any CPU.Build.0 = Release|Any CPU + EndGlobalSection +EndGlobal diff --git a/PDFSharp.Extensions/PDFSharp.Extensions.csproj b/PDFSharp.Extensions/PDFSharp.Extensions.csproj new file mode 100644 index 0000000..d2070fd --- /dev/null +++ b/PDFSharp.Extensions/PDFSharp.Extensions.csproj @@ -0,0 +1,29 @@ + + + + net6.0 + + PDFSharp.Extensions + PdfSharpCore.Extensions + 0.1.2.4 + Open Inventions + Open Inventions + https://github.com/Open-Inventions/PDFSharp.Extensions/blob/master/LICENSE + https://github.com/Open-Inventions/PDFSharp.Extensions + https://github.com/Open-Inventions/PDFSharp.Extensions + MIT + true + Extension methods for PdfSharpCore to support and simplify some common operations including image extraction. + Initial release. 
+ Copyright © 2014, George Heeres (gheeres@gmail.com) + 2022, OI + Licensed under MIT + pdf, pdfsharp, PdfSharpCore, image, extension + + + + + + + + + diff --git a/Pdf/PdfArrayExtensions.cs b/PDFSharp.Extensions/Pdf/PdfArrayExtensions.cs similarity index 96% rename from Pdf/PdfArrayExtensions.cs rename to PDFSharp.Extensions/Pdf/PdfArrayExtensions.cs index dac70b8..a6f8f1d 100644 --- a/Pdf/PdfArrayExtensions.cs +++ b/PDFSharp.Extensions/Pdf/PdfArrayExtensions.cs @@ -1,4 +1,5 @@ -using System; +using PdfSharpCore.Pdf; +using System; // ReSharper disable once CheckNamespace namespace PdfSharp.Pdf diff --git a/Pdf/PdfDictionaryExtensions.cs b/PDFSharp.Extensions/Pdf/PdfDictionaryExtensions.cs similarity index 89% rename from Pdf/PdfDictionaryExtensions.cs rename to PDFSharp.Extensions/Pdf/PdfDictionaryExtensions.cs index 23b203f..f47e960 100644 --- a/Pdf/PdfDictionaryExtensions.cs +++ b/PDFSharp.Extensions/Pdf/PdfDictionaryExtensions.cs @@ -1,17 +1,19 @@ using System; using System.Collections.Generic; -using System.Diagnostics; -using System.Drawing; -using System.Drawing.Imaging; +using SixLabors.ImageSharp; +using SixLabors.ImageSharp.PixelFormats; +using SixLabors.ImageSharp.Processing; using System.Globalization; using System.IO; using System.Linq; using System.Reflection; -using System.Runtime.InteropServices; using System.Threading.Tasks; using BitMiracle.LibTiff.Classic; -using PdfSharp.Pdf.Advanced; -using PdfSharp.Pdf.Filters; +using PDFSharp.Extensions.Pdf; +using PdfSharp.Pdf.Drawing; +using PdfSharpCore.Pdf.Advanced; +using PdfSharpCore.Pdf.Filters; +using PdfSharpCore.Pdf; // ReSharper disable once CheckNamespace namespace PdfSharp.Pdf @@ -26,7 +28,7 @@ public static class PdfDictionaryExtensions /// /// The dictionary to dump. /// The optional output method. If not provided, then the output will be directed to standard output. 
- private static void Dump(this PdfDictionary dictionary, Action output = null) + public static void Dump(this PdfDictionary dictionary, Action output = null) { if (dictionary == null) return; // If not output method was specified, write to the console. @@ -59,7 +61,7 @@ private static Image ImageFromDCTDecode(PdfDictionary dictionary) // DCTDecode a lossy filter based on the JPEG standard // We can just load directly from the stream. MemoryStream stream = new MemoryStream(dictionary.Stream.Value); - return (Bitmap.FromStream(stream)); + return (Image.Load(stream)); } /// @@ -133,7 +135,7 @@ private static byte[] GetTiffImageBufferFromCCITTFaxDecode(PdfDictionaryImageMet WriteTiffTag(buffer, TiffTag.IMAGEWIDTH, TiffType.LONG, 1, (uint)imageData.Width); WriteTiffTag(buffer, TiffTag.IMAGELENGTH, TiffType.LONG, 1, (uint)imageData.Height); WriteTiffTag(buffer, TiffTag.BITSPERSAMPLE, TiffType.SHORT, 1, (uint)imageData.BitsPerPixel); - WriteTiffTag(buffer, TiffTag.COMPRESSION, TiffType.SHORT, 1, (uint) Compression.CCITTFAX4); // CCITT Group 4 fax encoding. + WriteTiffTag(buffer, TiffTag.COMPRESSION, TiffType.SHORT, 1, (uint)imageData.Compression); WriteTiffTag(buffer, TiffTag.PHOTOMETRIC, TiffType.SHORT, 1, 0); // WhiteIsZero WriteTiffTag(buffer, TiffTag.STRIPOFFSETS, TiffType.LONG, 1, header_length); WriteTiffTag(buffer, TiffTag.SAMPLESPERPIXEL, TiffType.SHORT, 1, 1); @@ -155,34 +157,47 @@ private static byte[] GetTiffImageBufferFromCCITTFaxDecode(PdfDictionaryImageMet /// The image retrieve from the dictionary. If not found or an invalid image, then null is returned. private static Image ImageFromCCITTFaxDecode(PdfDictionary dictionary) { - Image image = null; PdfDictionaryImageMetaData imageData = new PdfDictionaryImageMetaData(dictionary); PixelFormat format = GetPixelFormat(imageData.ColorSpace, imageData.BitsPerPixel, true); - Bitmap bitmap = new Bitmap(imageData.Width, imageData.Height, format); // Determine if BLACK=1, create proper indexed color palette. 
- CCITTFaxDecodeParameters ccittFaxDecodeParameters = new CCITTFaxDecodeParameters(dictionary.Elements["/DecodeParms"].Get() as PdfDictionary); - if (ccittFaxDecodeParameters.BlackIs1) bitmap.Palette = PdfIndexedColorSpace.CreateColorPalette(Color.Black, Color.White); - else bitmap.Palette = PdfIndexedColorSpace.CreateColorPalette(Color.White, Color.Black); + + PdfDictionary decodeParams; + var decodeParamsObject = dictionary.Elements["/DecodeParms"].Get(); + if (decodeParamsObject is PdfArray) + decodeParams = (decodeParamsObject as PdfArray).First() as PdfDictionary; + else if (decodeParamsObject is PdfDictionary) + decodeParams = decodeParamsObject as PdfDictionary; + else + throw new NotSupportedException("Unknown format of CCITTFaxDecode params."); + + CCITTFaxDecodeParameters ccittFaxDecodeParameters = new CCITTFaxDecodeParameters(decodeParams); + + if (ccittFaxDecodeParameters.K == 0 || ccittFaxDecodeParameters.K > 0) + imageData.Compression = Compression.CCITTFAX3; + else if (ccittFaxDecodeParameters.K < 0) + imageData.Compression = Compression.CCITTFAX4; using (MemoryStream stream = new MemoryStream(GetTiffImageBufferFromCCITTFaxDecode(imageData, dictionary.Stream.Value))) { using (Tiff tiff = Tiff.ClientOpen("", "r", stream, new TiffStream())) { if (tiff == null) return (null); - - int stride = tiff.ScanlineSize(); - byte[] buffer = new byte[stride]; - for (int i = 0; i < imageData.Height; i++) { - tiff.ReadScanline(buffer, i); - - Rectangle imgRect = new Rectangle(0, i, imageData.Width, 1); - BitmapData imgData = bitmap.LockBits(imgRect, ImageLockMode.WriteOnly, PixelFormat.Format1bppIndexed); - Marshal.Copy(buffer, 0, imgData.Scan0, buffer.Length); - bitmap.UnlockBits(imgData); + + var raster = new int[imageData.Width * imageData.Height]; + if (!tiff.ReadRGBAImageOriented(imageData.Width, imageData.Height, raster, Orientation.TOPLEFT)) + throw new InvalidOperationException("Cannot read image into raster!"); + + var pixels = raster.Select(i => 
i.FromPacked()).ToArray(); + var bitmap = Image.LoadPixelData(pixels, imageData.Width, imageData.Height); + + if (!ccittFaxDecodeParameters.BlackIs1) + { + bitmap.Mutate(c => c.Invert()); } + + return (bitmap); } } - return (bitmap); } /// @@ -204,47 +219,37 @@ private static Image ImageFromFlateDecode(PdfDictionary dictionary) bool isIndexed = imageData.ColorSpace.IsIndexed; PixelFormat format = GetPixelFormat(imageData.ColorSpace, imageData.BitsPerPixel, isIndexed); - Bitmap bitmap = new Bitmap(imageData.Width, imageData.Height, format); + ColorPalette palette = default; // If indexed, retrieve and assign the color palette for the item. - if ((isIndexed) && (imageData.ColorSpace.IsRGB)) bitmap.Palette = ((PdfIndexedRGBColorSpace) imageData.ColorSpace).ToColorPalette(); - else if (imageData.ColorSpace is PdfGrayColorSpace) bitmap.Palette = ((PdfGrayColorSpace)imageData.ColorSpace).ToColorPalette(imageData.BitsPerPixel); - - // If not an indexed color, the .NET image component expects pixels to be in BGR order. However, our PDF stream is in RGB order. - byte[] stream = (format == PixelFormat.Format24bppRgb) ? ConvertRGBStreamToBGR(dictionary.Stream.UnfilteredValue) : dictionary.Stream.UnfilteredValue; - - BitmapData bitmapData = bitmap.LockBits(new Rectangle(0, 0, imageData.Width, imageData.Height), ImageLockMode.WriteOnly, format); - // We can't just copy the bytes directly; the BitmapData .NET class has a stride (padding) associated with it. 
- int bitsPerPixel = ((((int)format >> 8) & 0xFF)); - int length = (int)Math.Ceiling(bitmapData.Width * bitsPerPixel / 8.0); - for (int y = 0, height = bitmapData.Height; y < height; y++) { - int offset = y * length; - Marshal.Copy(stream, offset, bitmapData.Scan0 + (y * bitmapData.Stride), length); - } - bitmap.UnlockBits(bitmapData); - - return (bitmap); - } + if ((isIndexed) && (imageData.ColorSpace.IsRGB)) palette = ((PdfIndexedRGBColorSpace) imageData.ColorSpace).ToColorPalette(); + else if (imageData.ColorSpace is PdfGrayColorSpace) palette = ((PdfGrayColorSpace)imageData.ColorSpace).ToColorPalette(imageData.BitsPerPixel); - /// - /// Converts an RGB ordered stream to BGR ordering. - /// - /// - /// A PDF /DeviceRGB stream is stored in RGB ordering, however the .NET Image libraries expect BGR ordering. - /// - /// The input stream to reorder. The input array will be modified inline by this procedure. - /// Return the modified input stream. - private static byte[] ConvertRGBStreamToBGR(byte[] stream) - { - if (stream == null) return(null); + // Our PDF stream is in RGB order. 
+ byte[] stream = dictionary.Stream.UnfilteredValue; - for (int x = 0, length = stream.Length; x < length; x += 3) { - byte red = stream[x]; + if (LayerExtensions.HasError(stream, out var decodeErr)) + if ((stream = dictionary.Stream.RepeatUnFilter()) == null) + throw new InvalidOperationException(decodeErr); - stream[x] = stream[x+2]; - stream[x+2] = red; + if (imageData.ColorSpace.IsRGB) + { + if (isIndexed && format.IsIndexed()) + { + var copy = stream; + if (format == PixelFormat.Format4bppIndexed) + copy = stream.FixB4ToI8(); + + using var img = Image.LoadPixelData(copy, imageData.Width, imageData.Height); + return img.ApplyColorPalette(palette); + } + if (format == PixelFormat.Format24bppRgb) + { + return Image.LoadPixelData(stream, imageData.Width, imageData.Height); + } } - return (stream); + + throw new InvalidOperationException($"{imageData.ColorSpace} {format}"); } private static PdfDictionary ProcessFilters(PdfDictionary dictionary) @@ -255,7 +260,7 @@ private static PdfDictionary ProcessFilters(PdfDictionary dictionary) var map = new Dictionary>() { { "/FlateDecode", (d) => { var decoder = new FlateDecode(); - return (decoder.Decode(d)); + return (decoder.Decode(d, dictionary)); } } }; @@ -378,6 +383,7 @@ protected virtual IEnumerable GetRawPalette(PdfItem item) // The palette data is directly imbedded. 
       if (item.IsArray()) return(GetRawPalette(item as PdfArray));
       if (item.IsReference()) return (GetRawPalette(item as PdfReference));
+      // TODO if (item is PdfString pdfString) return new RawEncoding().GetBytes(pdfString.Value);
       throw new ArgumentException("The specified palette information was incorrect.", "item");
     }
@@ -507,7 +513,7 @@ protected virtual IEnumerable GetColorPalette()
       int offset = 3;
       byte[] values = GetRawPalette(_colorSpace).ToArray();
       for (int color = 0, length = Colors; color < length; color++) {
-        yield return (Color.FromArgb(values[color * offset], values[(color * offset) + 1], values[(color * offset) + 2]));
+        yield return (Color.FromRgb(values[color * offset], values[(color * offset) + 1], values[(color * offset) + 2]));
       }
     }
@@ -558,8 +564,8 @@ public ColorPalette ToColorPalette(int bitsPerPixel)
       ColorPalette palette = PdfIndexedRGBColorSpace.CreateColorPalette(colors);
       Parallel.For(0, colors, (color) => {
-        int gray = (int) Math.Floor((256f - 1)/ (colors - 1) * color);
-        palette.Entries[color] = Color.FromArgb(gray, gray, gray);
+        byte gray = (byte) Math.Floor((256f - 1)/ (colors - 1) * color);
+        palette.Entries[color] = Color.FromRgb(gray, gray, gray);
       });
       return (palette);
     }
@@ -641,7 +647,6 @@ public static PdfColorSpace Parse(PdfItem colorSpace)
         // Standard CMYK Colorspace
         { "/DeviceCMYK", (a) => {
             throw new NotImplementedException("CMYK encoded images are not supported.");
-            return (new PdfCMYKColorSpace());
           }
         },
       };
@@ -673,6 +678,9 @@ class PdfDictionaryImageMetaData
     /// The colorspace information for the image.
     public PdfColorSpace ColorSpace { get; set; }
+    /// The Compression for the image.
+    public Compression Compression { get; set; }
+
     /// The dictionary object o parse.
     public PdfDictionaryImageMetaData(PdfDictionary dictionary)
     {
@@ -698,6 +706,8 @@ private void Initialize(PdfDictionary dictionary)
         ColorSpace = PdfDictionaryColorSpace.Parse(colorSpace);
       }
       else ColorSpace = new PdfRGBColorSpace(); // Default to RGB Color Space
+
+      Compression = Compression.CCITTFAX4;
     }
     ///
@@ -709,7 +719,7 @@ private void Initialize(PdfDictionary dictionary)
     /// 2
     public override string ToString()
     {
-      Func> palette = () => ((PdfIndexedRGBColorSpace)ColorSpace).Palette.Select((c,i) => String.Format("[{0:000}]{1:x2}{2:x2}{3:x2}{4:x2}", i, c.A, c.R, c.G, c.B));
+      Func> palette = () => ((PdfIndexedRGBColorSpace)ColorSpace).Palette.Unpack().Select((c,i) => String.Format("[{0:000}]{1:x2}{2:x2}{3:x2}{4:x2}", i, c.A, c.R, c.G, c.B));
       return (String.Format("{0}x{1} @ {2}bpp{3}", Width, Height, BitsPerPixel,
         (ColorSpace is PdfIndexedRGBColorSpace) ? String.Format(" /Indexed({0}): {1}", ((PdfIndexedColorSpace) ColorSpace).Colors, String.Join(", ", palette.Invoke()))
         : null));
diff --git a/Pdf/PdfDocumentExtensions.cs b/PDFSharp.Extensions/Pdf/PdfDocumentExtensions.cs
similarity index 94%
rename from Pdf/PdfDocumentExtensions.cs
rename to PDFSharp.Extensions/Pdf/PdfDocumentExtensions.cs
index e83c490..50d101b 100644
--- a/Pdf/PdfDocumentExtensions.cs
+++ b/PDFSharp.Extensions/Pdf/PdfDocumentExtensions.cs
@@ -1,6 +1,7 @@
 using System;
 using System.Collections.Generic;
-using System.Drawing;
+using SixLabors.ImageSharp;
+using PdfSharpCore.Pdf;
 // ReSharper disable once CheckNamespace
 namespace PdfSharp.Pdf
diff --git a/PDFSharp.Extensions/Pdf/PdfHotFixExtensions.cs b/PDFSharp.Extensions/Pdf/PdfHotFixExtensions.cs
new file mode 100644
index 0000000..8724a7e
--- /dev/null
+++ b/PDFSharp.Extensions/Pdf/PdfHotFixExtensions.cs
@@ -0,0 +1,25 @@
+using System.IO.Compression;
+using System.IO;
+using PdfSharpCore.Pdf;
+
+namespace PDFSharp.Extensions.Pdf
+{
+    internal static class PdfHotFixExtensions
+    {
+        public static byte[] RepeatUnFilter(this PdfDictionary.PdfStream dictStream)
+        {
+            var input = dictStream.Value;
+            if (input == null)
+                return null;
+
+            using var archive = new MemoryStream(input);
+            archive.Position = 2; // Skip header
+
+            using var output = new MemoryStream();
+            using var deflate = new DeflateStream(archive, CompressionMode.Decompress);
+            deflate.CopyTo(output);
+
+            return output.ToArray();
+        }
+    }
+}
\ No newline at end of file
diff --git a/Pdf/PdfItemExtensions.cs b/PDFSharp.Extensions/Pdf/PdfItemExtensions.cs
similarity index 98%
rename from Pdf/PdfItemExtensions.cs
rename to PDFSharp.Extensions/Pdf/PdfItemExtensions.cs
index 9560911..ba8206d 100644
--- a/Pdf/PdfItemExtensions.cs
+++ b/PDFSharp.Extensions/Pdf/PdfItemExtensions.cs
@@ -1,4 +1,5 @@
-using PdfSharp.Pdf.Advanced;
+using PdfSharpCore.Pdf;
+using PdfSharpCore.Pdf.Advanced;
 // ReSharper disable once CheckNamespace
 namespace PdfSharp.Pdf
diff --git a/Pdf/PdfPageExtensions.cs b/PDFSharp.Extensions/Pdf/PdfPageExtensions.cs
similarity index 94%
rename from Pdf/PdfPageExtensions.cs
rename to PDFSharp.Extensions/Pdf/PdfPageExtensions.cs
index f641b4b..c9c696c 100644
--- a/Pdf/PdfPageExtensions.cs
+++ b/PDFSharp.Extensions/Pdf/PdfPageExtensions.cs
@@ -1,7 +1,8 @@
 using System;
 using System.Collections.Generic;
-using System.Drawing;
-using PdfSharp.Pdf.Advanced;
+using SixLabors.ImageSharp;
+using PdfSharpCore.Pdf;
+using PdfSharpCore.Pdf.Advanced;
 // ReSharper disable once CheckNamespace
 namespace PdfSharp.Pdf
diff --git a/PDFSharp.Extensions/System/Drawing/ColorPalette.cs b/PDFSharp.Extensions/System/Drawing/ColorPalette.cs
new file mode 100644
index 0000000..97c60bf
--- /dev/null
+++ b/PDFSharp.Extensions/System/Drawing/ColorPalette.cs
@@ -0,0 +1,15 @@
+using SixLabors.ImageSharp;
+
+// ReSharper disable once CheckNamespace
+namespace PdfSharp.Pdf.Drawing
+{
+    public sealed class ColorPalette
+    {
+        public Color[] Entries { get; }
+
+        public ColorPalette(int count = 1)
+        {
+            Entries = new Color[count];
+        }
+    }
+}
diff --git a/System/Drawing/ImageExtensions.cs b/PDFSharp.Extensions/System/Drawing/ImageExtensions.cs
similarity index 88%
rename from System/Drawing/ImageExtensions.cs
rename to PDFSharp.Extensions/System/Drawing/ImageExtensions.cs
index 0d1dc10..c9a0a84 100644
--- a/System/Drawing/ImageExtensions.cs
+++ b/PDFSharp.Extensions/System/Drawing/ImageExtensions.cs
@@ -1,9 +1,10 @@
 using System.Collections.Generic;
-using PdfSharp.Drawing;
-using PdfSharp.Pdf;
+using SixLabors.ImageSharp;
+using PdfSharpCore.Drawing;
+using PdfSharpCore.Pdf;
 // ReSharper disable once CheckNamespace
-namespace System.Drawing
+namespace PdfSharp.Pdf.Drawing
 {
   ///
   /// Extension methods for the class.
@@ -38,7 +39,7 @@ public static PdfDocument ToPdf(this IEnumerable images)
         document.AddPage(page);
         XGraphics xGraphics = XGraphics.FromPdfPage(page);
-        XImage xImage = XImage.FromGdiPlusImage(image);
+        XImage xImage = XImage.FromImageSource(image.ToSource());
         xGraphics.DrawImage(xImage, 0, 0, image.Width, image.Height);
       }
       return (document);
diff --git a/PDFSharp.Extensions/System/Drawing/LayerExtensions.cs b/PDFSharp.Extensions/System/Drawing/LayerExtensions.cs
new file mode 100644
index 0000000..0b478bd
--- /dev/null
+++ b/PDFSharp.Extensions/System/Drawing/LayerExtensions.cs
@@ -0,0 +1,91 @@
+using System.Linq;
+using System.Text;
+using BitMiracle.LibTiff.Classic;
+using MigraDocCore.DocumentObjectModel.MigraDoc.DocumentObjectModel.Shapes;
+using PdfSharpCore.Utils;
+using SixLabors.ImageSharp;
+using SixLabors.ImageSharp.Formats;
+using SixLabors.ImageSharp.Formats.Png;
+using SixLabors.ImageSharp.PixelFormats;
+
+// ReSharper disable once CheckNamespace
+namespace PdfSharp.Pdf.Drawing
+{
+    public static class LayerExtensions
+    {
+        public static ImageSource.IImageSource ToSource(this Image image,
+            int quality = 100, IImageFormat format = null)
+        {
+            var fmt = format ?? PngFormat.Instance;
+            var copy = image.CloneAs();
+            var source = ImageSharpImageSource.FromImageSharpImage(copy, fmt, quality);
+            return source;
+        }
+
+        public static Rgba32[] Unpack(this Color[] colors)
+        {
+            return colors.Select(c => c.ToPixel()).ToArray();
+        }
+
+        public static bool IsIndexed(this PixelFormat format)
+        {
+            return format
+                is PixelFormat.Format8bppIndexed
+                or PixelFormat.Format4bppIndexed
+                or PixelFormat.Format1bppIndexed;
+        }
+
+        public static Image ApplyColorPalette(this Image image, ColorPalette palette)
+        {
+            Image target = new(image.Width, image.Height);
+            image.ProcessPixelRows(target, (srcAcc, dstAcc) =>
+            {
+                for (var y = 0; y < srcAcc.Height; y++)
+                {
+                    var srcRow = srcAcc.GetRowSpan(y);
+                    var dstRow = dstAcc.GetRowSpan(y);
+                    for (var x = 0; x < srcRow.Length; x++)
+                    {
+                        ref var srcPixel = ref srcRow[x];
+                        ref var dstPixel = ref dstRow[x];
+                        var color = palette.Entries[srcPixel.PackedValue];
+                        dstPixel = color;
+                    }
+                }
+            });
+            return target;
+        }
+
+        public static Rgba32 FromPacked(this int bits)
+        {
+            return new Rgba32(Tiff.GetR(bits), Tiff.GetG(bits), Tiff.GetB(bits), Tiff.GetA(bits));
+        }
+
+        public static byte[] FixB4ToI8(this byte[] input)
+        {
+            const int step = 2;
+            var copy = new byte[input.Length * step];
+            for (var i = 0; i < input.Length; i++)
+            {
+                var raw = input[i];
+                var p = i * step;
+                copy[p + 0] = (byte)(raw >> 4);
+                copy[p + 1] = (byte)(raw & 0x0F);
+            }
+            return copy;
+        }
+
+        public static bool HasError(byte[] bytes, out string error)
+        {
+            if (bytes.Length == 37)
+            {
+                var text = Encoding.UTF8.GetString(bytes);
+                error = text.Trim((char)65533);
+                return true;
+            }
+
+            error = default;
+            return false;
+        }
+    }
+}
diff --git a/PDFSharp.Extensions/System/Drawing/PixelFormat.cs b/PDFSharp.Extensions/System/Drawing/PixelFormat.cs
new file mode 100644
index 0000000..5de97a8
--- /dev/null
+++ b/PDFSharp.Extensions/System/Drawing/PixelFormat.cs
@@ -0,0 +1,18 @@
+
+// ReSharper disable once CheckNamespace
+// ReSharper disable InconsistentNaming
+namespace PdfSharp.Pdf.Drawing
+{
+    public enum PixelFormat
+    {
+        Undefined = 0,
+
+        Format1bppIndexed = 196865,
+
+        Format4bppIndexed = 197634,
+
+        Format8bppIndexed = 198659,
+
+        Format24bppRgb = 137224
+    }
+}
diff --git a/Properties/AssemblyInfo.cs b/Properties/AssemblyInfo.cs
deleted file mode 100644
index 4bd665f..0000000
--- a/Properties/AssemblyInfo.cs
+++ /dev/null
@@ -1,35 +0,0 @@
-using System.Reflection;
-using System.Runtime.InteropServices;
-
-// General Information about an assembly is controlled through the following
-// set of attributes. Change these attribute values to modify the information
-// associated with an assembly.
-[assembly: AssemblyTitle("PDFSharp.Extensions")]
-[assembly: AssemblyDescription("A set of extensions for the PDFSharp.NET library.")]
-[assembly: AssemblyConfiguration("")]
-[assembly: AssemblyCompany("n/a")]
-[assembly: AssemblyProduct("PDFSharp.Extensions")]
-[assembly: AssemblyCopyright("Copyright © 2014, George Heeres (gheeres@gmail.com)")]
-[assembly: AssemblyTrademark("Licensed under MIT")]
-[assembly: AssemblyCulture("")]
-
-// Setting ComVisible to false makes the types in this assembly not visible
-// to COM components. If you need to access a type in this assembly from
-// COM, set the ComVisible attribute to true on that type.
-[assembly: ComVisible(false)]
-
-// The following GUID is for the ID of the typelib if this project is exposed to COM
-[assembly: Guid("047162b6-b659-496b-bab3-57dc8be2c677")]
-
-// Version information for an assembly consists of the following four values:
-//
-// Major Version
-// Minor Version
-// Build Number
-// Revision
-//
-// You can specify all the values or you can default the Build and Revision Numbers
-// by using the '*' as shown below:
-// [assembly: AssemblyVersion("1.0.*")]
-[assembly: AssemblyVersion("0.1.2.2")]
-[assembly: AssemblyFileVersion("0.1.2.2")]
diff --git a/README.md b/README.md
index a6a30e4..e67bc94 100644
--- a/README.md
+++ b/README.md
@@ -10,6 +10,22 @@ Licensed under the MIT license.
 ---------------------------------------
+Install and use it with
+
+```
+dotnet add package PdfSharpCore.Extensions
+```
+
+---------------------------------------
+
+This fork includes some maintenance from:
+* https://github.com/ghosttie/PDFSharp.Extensions/tree/patch-1
+* https://github.com/sk2andy/PDFSharp.Extensions
+* https://github.com/jaykay-design/PDFSharp.Extensions
+* https://github.com/vanderkorn/PDFSharp.Extensions
+
+---------------------------------------
+
 Image Utilities
 -----------
 Extension methods are provided for extracting images from an entire document,
@@ -17,7 +33,7 @@ individual pages or specific images. Currently only RGB encoded images (/DeviceR
 are supported with either /DCTDecode or /FlatEncode encoding. /Indexed colorspaces are also
 supported for /FlatEncode images including 1bpp images (black & white).
-All images are extracted as System.Drawing.Image obects which can then be saved or
+All images are extracted as SixLabors.ImageSharp.Image objects which can then be saved or
 manipulated as necessary.
 __Example__
diff --git a/packages.config b/packages.config
deleted file mode 100644
index b2ff103..0000000
--- a/packages.config
+++ /dev/null
@@ -1,5 +0,0 @@
-
-
-
-
-
\ No newline at end of file