#1 2017-05-19 05:48

Anynomus
Member
Registered: 2017-05-18
Posts: 1

How TrID library for detecting file extension works

I have spent many days studying file signatures and MIME types in order to build an accurate file detection system, but I could not make it work, because many extensions share the same signature (e.g. doc, xls and msi have the same signature, and therefore the same MIME type, when you detect by signature alone).
I would like to learn how exactly TrID identifies file types beyond the basic signature, if anyone is willing to share.

Last edited by Anynomus (2017-05-19 05:49)


#2 2017-05-19 10:19

den4b
Administrator
From: den4b.com
Registered: 2006-04-06
Posts: 3,367

Re: How TrID library for detecting file extension works

Most content-based file type detection algorithms work in essentially the same way.

First, you need to analyze many files of the same type to establish a common file content pattern (binary signature) which is representative of that file type. This part can vary in complexity depending on the file format and desired accuracy of detection. The more file types you analyze, the bigger your dictionary will get, hence, the more file types you should be able to identify.
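To make the analysis step concrete, here is a minimal sketch in Python. It is not TrID's actual algorithm, only one simple way to do it: compare several samples of one file type and keep the byte offsets where every sample agrees. The function name `common_pattern`, the `window` parameter and the two sample byte strings are made up for illustration.

```python
def common_pattern(samples, window=32):
    """Return {offset: byte} for every offset (within the first `window`
    bytes) where all samples contain the same byte value."""
    if not samples:
        return {}
    # Only compare offsets that exist in every sample.
    limit = min(window, min(len(s) for s in samples))
    pattern = {}
    for offset in range(limit):
        first = samples[0][offset]
        if all(s[offset] == first for s in samples):
            pattern[offset] = first
    return pattern


# Two hypothetical samples: same 8-byte PNG magic and IHDR chunk header,
# but different image widths in the bytes that follow.
sample_a = b"\x89PNG\r\n\x1a\n" + b"\x00\x00\x00\rIHDR" + b"\x00\x00\x01\x00"
sample_b = b"\x89PNG\r\n\x1a\n" + b"\x00\x00\x00\rIHDR" + b"\x00\x00\x02\x80"

pattern = common_pattern([sample_a, sample_b])
print(sorted(pattern))  # offsets that are constant across both samples
```

With only two samples some offsets agree by coincidence, which is why analyzing many files of the same type gives a more reliable signature.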

Then, to detect the type of an unknown file, you check it against all your known signatures to identify the best match. Sometimes you will get more than one match, either because several formats genuinely share the same leading bytes or because your original file type analysis was not detailed enough to tell them apart.
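The matching step can be sketched like this. Again, it is only an illustration under simple assumptions, not how TrID is implemented: each signature is a set of offset/byte pairs, a signature matches only if all of its bytes match, and longer signatures rank higher. The names `SIGNATURES`, `prefix` and `rank_matches` are invented for the example; the magic bytes themselves are the well-known PNG, ZIP and OLE2 compound file headers.

```python
def prefix(magic):
    """Turn leading magic bytes into an {offset: byte} pattern."""
    return dict(enumerate(magic))


OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"

# A tiny dictionary. doc, xls and msi are all OLE2 compound files,
# so a header-only signature cannot tell them apart.
SIGNATURES = {
    "png": prefix(b"\x89PNG\r\n\x1a\n"),
    "zip": prefix(b"PK\x03\x04"),
    "doc": prefix(OLE2_MAGIC),
    "xls": prefix(OLE2_MAGIC),
    "msi": prefix(OLE2_MAGIC),
}


def rank_matches(data, signatures):
    """Return the names of all matching signatures, best match first.
    The score is simply the number of bytes the signature pins down."""
    hits = []
    for name, pattern in signatures.items():
        if all(off < len(data) and data[off] == byte
               for off, byte in pattern.items()):
            hits.append((len(pattern), name))
    # Highest score first, ties broken alphabetically.
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return [name for _, name in hits]


unknown = OLE2_MAGIC + b"\x00" * 16
print(rank_matches(unknown, SIGNATURES))
```

The OLE2 input matches doc, xls and msi with the same score, which is exactly the ambiguity described in the first post. To separate them, the signatures would need additional patterns found deeper inside the file, not just the header.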

Here is an example of an open file signature database: filext.com

More relevant resources: file-extension.net

There is a lot of information on the internet; all you need to do is look for it.

