Convert PDF to DWG with Object Recognition

Updated Jul 8, 2019
Identifying objects in an architectural drawing

Since its launch in 1993, the PDF format has come to reign supreme in the field of document sharing. The high fidelity with which it displays a range of textual and graphical information makes it an obvious choice for collaborative working. What it doesn’t cope with so well, however, is editing. For this reason, converting PDFs to easily editable formats, such as DWG, is not only popular, but necessary. To stay faithful to the document’s original contents, however, users need to opt for a program that can convert PDF to DWG with object recognition.

It would be no stretch to say that, without object recognition, it simply isn’t possible to achieve good-quality conversion results. That’s because object recognition tools discern how to map each object in your document onto the most suitable vector element. As a result, you’ll end up with a vector file that you can actually use and edit in CAD software.

To gain a better understanding of how converting a PDF to DWG with object recognition works, watch our video below. Don’t forget to read the full article for in-depth information on the conversion process!


Video: Converting a vector PDF to DWG with object recognition

View video transcript

In this video will look deeper at Scan2CAD’s object recognition, specifically when importing a PDF, and also when the PDF is 100% vector. So you may have already seen Scan2CAD’s object recognition when used on raster elements, i.e., an image which is made up of pixels, where you can use OCR and object identification to find circles, or dash lines and so on, and convert them into fully editable vector CAD elements.

In this case, we’re going to look at the same technology, but when the elements are vector. We’ve loaded in a PDF, and the PDF, as I say, is 100% vector. And I’m going to go to View, View Vector Colors, so we can see visually the type of vectors that make up this electrical schematic. In Scan2CAD, red represents vector line elements. So you can see that the whole contents of this PDF is made up of a collection of exploded vectors.

As we zoom in, we can see we’ve got text here, arrow lines, dash lines, and circles, but all of these elements are formed up of many, many individual vector elements. And this is actually quite common, where some applications, when saving out to a PDF, kind of dumb down the contents of it, and it’s a common requirement where you want to use some sort of object recognition to convert those vector elements to their correct CAD elements.

So one of the things you may want to do is to run OCR, and that’s most commonly used on raster text. Scan2CAD does actually have the feature of OCR on vector elements, and we have other elements, as we said, like dash lines and so on. But we’re just gonna focus on one requirement, which is to convert these exploded circles into singular circle elements, which will be really easy to edit in Scan2CAD, or if we pulled it into another CAD application like AutoCAD and so on. So to do this, we use Scan2CAD’s vector optimization feature. We’ll click that option, and we can see a bunch of features here for converting elements in this vector file.

So, for example, we have convert all solids to non-solids, make beziers, arcs, polygons, recognition of dash lines, arrow lines, and so on. In this case, we just want to make circles. So this is going to look for all objects within the file, all vector objects, that is, and convert them to a circle where appropriate and remove the previous objects which represented it. So we’re just gonna click OK and let Scan2CAD do its work. We can see it’s thinking there, and it’s complete.

So you’ll notice there’s some new colors in the image now. We can view the true colors if we want. You can see that’s not been affected, but if we go back to viewing the colour by vector type, we see that we’ve now got blue elements representing circles. So where we previously saw these circles made up of many individual vectors, we’ve now got a fully editable circle element which we could move and do what we want with. And when we save this out to either DXF or DWG from the original PDF, that element will be there and could be pulled into another application.


Why convert from PDF to DWG?

If you work in any design-related field, you’ll have one goal in mind: creating something useful. The process behind this will involve multiple iterations of a single design, and, often, the repurposing of existing designs to create new ones. Designing something completely perfect in one try, with no reworking or editing, is nigh on impossible. Editing is essential to the process.

Unfortunately, PDF files, which are so common for document storage, are terrible at this. There’s a good reason why: their contents should look exactly the same on any machine. This makes them, in some ways, the digital form of printing out a definitive copy on paper. For this reason, they’re an ideal choice for sharing. However, it’s also why, if you want to edit them in any way, you’ll need to convert to another format.

If you’re working with CAD, one of the top formats to choose is DWG. The DWG file format has a long-standing association with AutoCAD, the market-leading CAD software from AutoCAD. However, there now exists a variety of ways to view DWG files without AutoCAD, with an array of competing software able to display the format.

Put simply, converting from PDF to DWG opens doors for designers. It’s what allows a user to turn a static image into a fully editable 3D digital model. From here, the possibilities are endless across fields as diverse as engineering, architecture and product design. Firstly, though, you’ll need to find a software that can convert PDF to DWG for you. As we’ll discuss next, this isn’t always the easiest task.


Why it’s important to convert PDF to DWG with object recognition

Comparing an exploded vector circle with a circle entity

Without object recognition you cannot convert exploded vector shapes into their correct entities.

PDFs can be complex documents, with the ability to store vector, raster and text elements. DWGs, meanwhile, are vector files, and every element within them is editable. As a result, converting from PDF to DWG can be a tricky process.

Plenty of programs out there claim to be able to do a good job at this type of conversion. However, the results are often patchy. You can see an example of this in the video above, where the output image contains exploded vectors. This means that, instead of converting each element within an image to a suitable vector element, the conversion program has only drawn simple vector lines.

If you were unfamiliar with CAD, this may not seem so bad. After all, you’ve ended up with a vector image that looks like a faithful representation of the original PDF. Job done, right? Wrong. The kind of image seen at the start of the above video is next to useless for CAD work. The exploded text in the video above, for example, isn’t really text, but a series of lines. This means you can’t type over it—making editing annoying and difficult.

A good example of conversion output would separate these elements into vector text, circles, arcs, polylines, and so on. In order for conversion software to do this, it needs to have object recognition tools in its arsenal. These tools enable conversion software to discern which vector elements are the most suitable fit for parts of a raster image. In the video above, you can see just one example of this, with the conversion of vector lines to circles.

Raster and vector object recognition

When we think of object recognition, we usually think of raster images. However, given that PDFs can contain a range of different content, it’s worth noting that object recognition can encompass raster, vector and text.

In the raster-to-vector conversion process, this involves tracing appropriate vector elements over raster features. When this includes raster text, Optical Character Recognition, a separate, albeit similar, tool, can also come into play.

However, as we’ve seen, even a vector PDF won’t necessarily use the most suitable vector elements. The video above shows one of the more extreme examples of this, but other issues to watch out for may include a single dashed line being treated as a series of small, separate lines.

It’s easy to understand why this isn’t ideal when it comes to editing. Sadly, this is a very common problem with PDFs, which only support a limited range of vector elements. It’s for this reason that vector PDFs often replace complex elements, such as Bezier curves or arcs, with lines.

Object recognition can change this type of ‘dumb’ vector design into an image that you can edit with CAD software. However, not all conversion tools are able to do this well. That’s why choosing the right software is key.


Choosing software to convert PDF to DWG with object recognition

It’s clear why object recognition matters so much when converting from PDF to DWG. The trickier thing is figuring out which conversion software excels at object recognition, and which lags behind. Luckily, we have a few pointers you can follow.

Firstly, be wary of online converters. Though it doesn’t take long to find a PDF to DWG converter on Google, the quality of such tools is often extremely lacking. In fact, such programs are often unable to create anything more useful than the image in the video above! Other key pitfalls of online converters include safety issues, and threats to the security of your intellectual property. All in all, it’s just not worth the risk.

Some converters do manage to get halfway to a decent conversion job. However, they get sloppy when it comes to some of the finer points. Take text, for example: an average converter may do an admirable job of detecting raster text within your PDF and converting this to vector text. They fall down, though, when it comes to arranging that text into useful strings. Take a look at the image below to see what to aim for—and what to avoid.

On the left is text converted using an online converter. On the right is text converted by Scan2CAD.

On the left is an example of the unintuitive vector text other PDF-to-DWG converters produce. On the right is text converted by Scan2CAD and arranged into logical text strings.

The best solution, therefore, is to opt for a PDF-to-DWG conversion program with a track record of professional results. Enter Scan2CAD. With over 20 years in the business, we’ve learned a thing or two about what it takes to produce usable vector images. Read on to see what we have to offer.


Object recognition with raster images in Scan2CAD

As PDF files can contain both raster and vector images, you’ll need software that supports raster and vector object recognition. Scan2CAD’s advanced object recognition tools mean it’s one step ahead of other conversion software. Its recognition engine starts using preset vectorization settings; you also have the ability to alter these settings to suit your needs. In the video tutorial above, for example, you’ll can see how you can input personalized settings for features such as hatching, arrow lines, and dashed lines.

Additionally, Scan2CAD is able to combine object recognition with OCR. In fact, by running these two tools together on your PDF, you can obtain great vectorization results with distinct vector and text elements in just seconds.

scan2cad advert for free trial