#1 2007-09-27 05:52

Yogui
Member
Registered: 2007-09-27
Posts: 41

Preset to check English misspellings

Hi,

I came across the following misspellings list from Wikipedia and was thinking of building a BIG preset to check mp3s against misspeliing errors.

Has anyone done this already?

If so, please post a link to it.

Is there a limit on the number of rules that can be added to a preset?

This misspelling preset could have roughly 6000 rules in total.

How would performance be affected?

Cheers, yogui
PS1: The list is too long to post (5000+ lines).
PS2: Of course I'm planning to build the rules by editing the .rnp preset in Notepad++ or similar; I won't even try to enter them manually!


;-------------------------------------------------------------------------------
; Common Misspellings
;-------------------------------------------------------------------------------
::abandonned::abandoned
::aberation::aberration
::abilties::abilities
::abilty::ability
::abondon::abandon
::abondoned::abandoned
::abondoning::abandoning
::abondons::abandons
::aborigene::aborigine
::abortificant::abortifacient
::abreviated::abbreviated
::abreviation::abbreviation
::abritrary::arbitrary
::absense::absence
::absolutly::absolutely
::absorbsion::absorption
::absorbtion::absorption
::abundacies::abundances
::abundancies::abundances
::abundunt::abundant
::abutts::abuts
::acadamy::academy
::acadmic::academic
::accademic::academic
::accademy::academy
::acccused::accused
::accelleration::acceleration
::acceptence::acceptance
::acceptible::acceptable
::accessable::accessible
::accidentaly::accidentally
::accidently::accidentally
::acclimitization::acclimatization
::accomadate::accommodate
::accomadated::accommodated
::accomadates::accommodates
::accomadating::accommodating
::accomadation::accommodation
::accomadations::accommodations
::accomdate::accommodate
::accomodate::accommodate
::accomodated::accommodated
::accomodates::accommodates
::accomodating::accommodating
::accomodation::accommodation
::accomodations::accommodations
::accompanyed::accompanied
::accordeon::accordion
::accordian::accordion
::accoring::according
::accoustic::acoustic
::accquainted::acquainted
::accross::across
::accussed::accused
::acedemic::academic
::acheive::achieve
::acheived::achieved
::acheivement::achievement
::acheivements::achievements
::acheives::achieves
::acheiving::achieving
::acheivment::achievement
::acheivments::achievements
::achievment::achievement
::achievments::achievements
::achivement::achievement
::achivements::achievements
::acknowldeged::acknowledged
::acknowledgeing::acknowledging
::acomplish::accomplish
::acomplished::accomplished
::acomplishment::accomplishment
::acomplishments::accomplishments
::acording::according
::acordingly::accordingly
::acquaintence::acquaintance
::acquaintences::acquaintances
::acquiantence::acquaintance
::acquiantences::acquaintances
::acquited::acquitted
::activites::activities
::activly::actively
::actualy::actually
::acuracy::accuracy
::acused::accused
::acustom::accustom
::acustommed::accustomed
::adaption::adaptation
::adaptions::adaptations
::adavanced::advanced
::adbandon::abandon
::additinally::additionally
::addmission::admission
::addopt::adopt
::addopted::adopted
::addoptive::adoptive
::addresable::addressable
::addresed::addressed
::addresing::addressing
::addressess::addresses
::addtion::addition
::addtional::additional
::adecuate::adequate
::adhearing::adhering
::adherance::adherence
::admendment::amendment
::admininistrative::administrative
::adminstered::administered
::adminstrate::administrate
::adminstration::administration
::adminstrative::administrative
::adminstrator::administrator
::admissability::admissibility
::admissable::admissible
::admited::admitted
::admitedly::admittedly
::adn::and
::adolecent::adolescent
::adquire::acquire
::adquired::acquired
::adquires::acquires
::adquiring::acquiring
::adres::address
::adresable::addressable
::adresing::addressing
::adress::address
::adressable::addressable
::adressed::addressed
::adventrous::adventurous
::advertisment::advertisement
::advertisments::advertisements
::advesary::adversary
::adviced::advised
::aeriel::aerial
::aeriels::aerials
::afair::affair
::afficianados::aficionados
::afficionado::aficionado
::afficionados::aficionados
::affilate::affiliate
::affilliate::affiliate
::aforememtioned::aforementioned
::againnst::against
::agains::against
::agaisnt::against
::aganist::against
::aggaravates::aggravates
::aggreed::agreed
::aggreement::agreement
::aggregious::egregious
::aggresive::aggressive
::agian::again
::agianst::against
::agin::again
::aginst::against
::agravate::aggravate
::agre::agree
::agred::agreed
::agreeement::agreement
::agreemnt::agreement
::agregate::aggregate
::agregates::aggregates
::agreing::agreeing
::agression::aggression
::agressive::aggressive
::agressively::aggressively
::agressor::aggressor
::agricuture::agriculture
::agrieved::aggrieved
::ahev::have
::ahppen::happen
::ahve::have
::aicraft::aircraft
::aiport::airport
::airbourne::airborne
::aircaft::aircraft
::aircrafts::aircraft
::airporta::airports
::airrcraft::aircraft

#2 2007-09-27 10:49

den4b
Administrator
From: den4b.com
Registered: 2006-04-06
Posts: 3,379

I don't think anyone has tried more than 1000 rules yet!

But there is no reason why it shouldn't work... Yes, it will affect performance.

There are a few ways of doing this with just 1 rule:

1) Using a single Replace rule, with the multiple-items delimiter *|*

2) Using a single PascalScript rule, which will load those words straight from an external text file

I would recommend the first one, for the best performance!
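
To make option 1 more concrete, here is a rough Python sketch of how a `*|*`-delimited find string and replace string could pair up item by item. It only models the behaviour as described in this thread. The function name `apply_multi_replace` is made up for illustration, and how ReNamer itself handles case, ordering, or mismatched item counts is an assumption here and may differ.

```python
DELIM = "*|*"  # multiple-items delimiter mentioned in the post above


def apply_multi_replace(text, find_field, replace_field):
    """Replace the n-th find item with the n-th replacement item, in order.

    Hypothetical model of a single multi-item Replace rule, not ReNamer code.
    """
    finds = find_field.split(DELIM)
    replacements = replace_field.split(DELIM)
    # Assumption of this sketch: both fields must list the same number of items.
    if len(finds) != len(replacements):
        raise ValueError("find and replace fields need the same number of items")
    for wrong, right in zip(finds, replacements):
        text = text.replace(wrong, right)
    return text


print(apply_multi_replace("accoustic abilty.mp3",
                          "abilty*|*accoustic",
                          "ability*|*acoustic"))
```

Each misspelling is tried in the order it is listed, so the whole list lives in one rule instead of thousands.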

#3 2007-09-28 04:36

Yogui
Member
Registered: 2007-09-27
Posts: 41

Thanks for your quick reply, Denis!

I think the program does a lot more than many users know (including myself, of course!).

If I'm not mistaken, this *|* trick does not appear to be documented.

Am I missing some manual and asking the same silly questions?

Well, here is where I am now:

I need to match all the FIND1, FIND2... entries with REPLACE1, REPLACE2...

I'm copying the list just for "A" below. At the moment I don't have a clue how to do the matching in Notepad++.

Do you?

A 3rd option would be to create one rule per correction, which would be around 6000+ simpler rules.

I imagine that if they are ordered alphabetically, it won't be that hard to maintain and upgrade.

Do you think performance would be better or worse than 1 huge rule?

I'm moving tomorrow so I'll be off the net for a couple of days.

Thanks again


[Rule0]
Name=Replace
Config=TEXTWHAT:%7EFIND1%7E%2A%7C%2A%7EFIND2%7E%2A%7C%2A%7EFIND3%7E;TEXTWITH:%7EREPLACE1%7E%2A%7C%2A%7EREPLACE2%7E%2A%7C%2A%7EREPLACE3%7E;WHICH:3;SKIPEXTENSION:1;CASESENSITIVE:0;USEWILDCARDS:0
Marked=1
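
The Config value above looks percent-encoded: `%7E` decodes to `~` and `%2A%7C%2A` decodes to `*|*`. As a minimal sketch, assuming the fields are separated by `;` and each key is separated from its value by the first `:` (inferred only from this one example, not from any preset documentation), the line can be decoded in Python like this. The helper name `decode_config` is hypothetical.

```python
from urllib.parse import unquote

# Config value copied from the preset shown above.
CONFIG = (
    "TEXTWHAT:%7EFIND1%7E%2A%7C%2A%7EFIND2%7E%2A%7C%2A%7EFIND3%7E;"
    "TEXTWITH:%7EREPLACE1%7E%2A%7C%2A%7EREPLACE2%7E%2A%7C%2A%7EREPLACE3%7E;"
    "WHICH:3;SKIPEXTENSION:1;CASESENSITIVE:0;USEWILDCARDS:0"
)


def decode_config(config):
    """Split 'KEY:value;KEY:value' pairs and percent-decode each value."""
    fields = {}
    for part in config.split(";"):
        key, _, value = part.partition(":")
        fields[key] = unquote(value)
    return fields


for key, value in decode_config(CONFIG).items():
    print(key, "=", value)
```

Decoded, the find field reads `~FIND1~*|*~FIND2~*|*~FIND3~`, which shows where the delimiter-joined word lists would go.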


::abandonned::abandoned
::aberation::aberration
::abilties::abilities
::abilty::ability
::abondon::abandon
::abondoned::abandoned
::abondoning::abandoning
::abondons::abandons
::aborigene::aborigine
::abortificant::abortifacient
::abreviated::abbreviated
::abreviation::abbreviation
::abritrary::arbitrary
::absense::absence
::absolutly::absolutely
::absorbsion::absorption
::absorbtion::absorption
::abundacies::abundances
::abundancies::abundances
::abundunt::abundant
::abutts::abuts
::acadamy::academy
::acadmic::academic
::accademic::academic
::accademy::academy
::acccused::accused
::accelleration::acceleration
::acceptence::acceptance
::acceptible::acceptable
::accessable::accessible
::accidentaly::accidentally
::accidently::accidentally
::acclimitization::acclimatization
::accomadate::accommodate
::accomadated::accommodated
::accomadates::accommodates
::accomadating::accommodating
::accomadation::accommodation
::accomadations::accommodations
::accomdate::accommodate
::accomodate::accommodate
::accomodated::accommodated
::accomodates::accommodates
::accomodating::accommodating
::accomodation::accommodation
::accomodations::accommodations
::accompanyed::accompanied
::accordeon::accordion
::accordian::accordion
::accoring::according
::accoustic::acoustic
::accquainted::acquainted
::accross::across
::accussed::accused
::acedemic::academic
::acheive::achieve
::acheived::achieved
::acheivement::achievement
::acheivements::achievements
::acheives::achieves
::acheiving::achieving
::acheivment::achievement
::acheivments::achievements
::achievment::achievement
::achievments::achievements
::achivement::achievement
::achivements::achievements
::acknowldeged::acknowledged
::acknowledgeing::acknowledging
::acomplish::accomplish
::acomplished::accomplished
::acomplishment::accomplishment
::acomplishments::accomplishments
::acording::according
::acordingly::accordingly
::acquaintence::acquaintance
::acquaintences::acquaintances
::acquiantence::acquaintance
::acquiantences::acquaintances
::acquited::acquitted
::activites::activities
::activly::actively
::actualy::actually
::acuracy::accuracy
::acused::accused
::acustom::accustom
::acustommed::accustomed
::adaption::adaptation
::adaptions::adaptations
::adavanced::advanced
::adbandon::abandon
::additinally::additionally
::addmission::admission
::addopt::adopt
::addopted::adopted
::addoptive::adoptive
::addresable::addressable
::addresed::addressed
::addresing::addressing
::addressess::addresses
::addtion::addition
::addtional::additional
::adecuate::adequate
::adhearing::adhering
::adherance::adherence
::admendment::amendment
::admininistrative::administrative
::adminstered::administered
::adminstrate::administrate
::adminstration::administration
::adminstrative::administrative
::adminstrator::administrator
::admissability::admissibility
::admissable::admissible
::admited::admitted
::admitedly::admittedly
::adn::and
::adolecent::adolescent
::adquire::acquire
::adquired::acquired
::adquires::acquires
::adquiring::acquiring
::adres::address
::adresable::addressable
::adresing::addressing
::adress::address
::adressable::addressable
::adressed::addressed
::adventrous::adventurous
::advertisment::advertisement
::advertisments::advertisements
::advesary::adversary
::adviced::advised
::aeriel::aerial
::aeriels::aerials
::afair::affair
::afficianados::aficionados
::afficionado::aficionado
::afficionados::aficionados
::affilate::affiliate
::affilliate::affiliate
::aforememtioned::aforementioned
::againnst::against
::agains::against
::agaisnt::against
::aganist::against
::aggaravates::aggravates
::aggreed::agreed
::aggreement::agreement
::aggregious::egregious
::aggresive::aggressive
::agian::again
::agianst::against
::agin::again
::aginst::against
::agravate::aggravate
::agre::agree
::agred::agreed
::agreeement::agreement
::agreemnt::agreement
::agregate::aggregate
::agregates::aggregates
::agreing::agreeing
::agression::aggression
::agressive::aggressive
::agressively::aggressively
::agressor::aggressor
::agricuture::agriculture
::agrieved::aggrieved
::ahev::have
::ahppen::happen
::ahve::have
::aicraft::aircraft
::aiport::airport
::airbourne::airborne
::aircaft::aircraft
::aircrafts::aircraft
::airporta::airports
::airrcraft::aircraft
::aisian::asian
::albiet::albeit
::alchohol::alcohol
::alchoholic::alcoholic
::alchol::alcohol
::alcholic::alcoholic
::alcohal::alcohol
::alcoholical::alcoholic
::aledge::allege
::aledged::alleged
::aledges::alleges
::alege::allege
::aleged::alleged
::alegience::allegiance
::algebraical::algebraic
::algorhitms::algorithms
::algoritm::algorithm
::algoritms::algorithms
::alientating::alienating
::alledge::allege
::alledged::alleged
::alledgedly::allegedly
::alledges::alleges
::allegedely::allegedly
::allegedy::allegedly
::allegely::allegedly
::allegence::allegiance
::allegience::allegiance
::allign::align
::alligned::aligned
::alliviate::alleviate
::allopone::allophone
::allopones::allophones
::allready::already
::allthough::although
::alltime::all-time
::almsot::almost
::alochol::alcohol
::alomst::almost
::alotted::allotted
::alowed::allowed
::alowing::allowing
::alreayd::already
::alse::else
::alsot::also
::alternitives::alternatives
::altho::although
::althought::although
::altough::although
::alwasy::always
::alwyas::always
::amalgomated::amalgamated
::amatuer::amateur
::amendmant::amendment
::amerliorate::ameliorate
::amke::make
::amking::making
::ammend::amend
::ammended::amended
::ammendment::amendment
::ammendments::amendments
::ammount::amount
::ammused::amused
::amoung::among
::amoungst::amongst
::amung::among
::analagous::analogous
::analitic::analytic
::analogeous::analogous
::anarchim::anarchism
::anarchistm::anarchism
::anbd::and
::ancestory::ancestry
::ancilliary::ancillary
::androgenous::androgynous
::androgeny::androgyny
::anihilation::annihilation
::aniversary::anniversary
::annoint::anoint
::annointed::anointed
::annointing::anointing
::annoints::anoints
::annouced::announced
::annualy::annually
::annuled::annulled
::anohter::another
::anomolies::anomalies
::anomolous::anomalous
::anomoly::anomaly
::anonimity::anonymity
::anounced::announced
::ansalisation::nasalisation
::ansalization::nasalization
::ansestors::ancestors
::antartic::antarctic
::anthromorphization::anthropomorphization
::anulled::annulled
::anwsered::answered
::anyhwere::anywhere
::anyother::any other
::anytying::anything
::aparent::apparent
::aparment::apartment
::apenines::apennines
::aplication::application
::aplied::applied
::apolegetics::apologetics
::apparant::apparent
::apparantly::apparently
::appart::apart
::appartment::apartment
::appartments::apartments
::appeareance::appearance
::appearence::appearance
::appearences::appearances
::appenines::apennines
::apperance::appearance
::apperances::appearances
::applicaiton::application
::applicaitons::applications
::appologies::apologies
::appology::apology
::apprearance::appearance
::apprieciate::appreciate
::approachs::approaches
::appropiate::appropriate
::appropraite::appropriate
::appropropiate::appropriate
::approproximate::approximate
::approxamately::approximately
::approxiately::approximately
::approximitely::approximately
::aprehensive::apprehensive
::apropriate::appropriate
::aproximate::approximate
::aproximately::approximately
::aquaintance::acquaintance
::aquainted::acquainted
::aquiantance::acquaintance
::aquire::acquire
::aquired::acquired
::aquiring::acquiring
::aquisition::acquisition
::aquitted::acquitted
::aranged::arranged
::arangement::arrangement
::arbitarily::arbitrarily
::arbitary::arbitrary
::archaelogists::archaeologists
::archaelogy::archaeology
::archetect::architect
::archetects::architects
::archetectural::architectural
::archetecturally::architecturally
::archetecture::architecture
::archiac::archaic
::archictect::architect
::archimedian::archimedean
::architechturally::architecturally
::architechture::architecture
::architechtures::architectures
::architectual::architectural
::archtype::archetype
::archtypes::archetypes
::aready::already
::areodynamics::aerodynamics
::argubly::arguably
::arguement::argument
::arguements::arguments
::arised::arose
::arival::arrival
::armamant::armament
::armistace::armistice
::aroud::around
::arrangment::arrangement
::arrangments::arrangements
::arround::around
::artical::article
::artice::article
::articel::article
::artifical::artificial
::artifically::artificially
::artillary::artillery
::arund::around
::asetic::ascetic
::asign::assign
::aslo::also
::asociated::associated
::asorbed::absorbed
::asphyxation::asphyxiation
::assasin::assassin
::assasinate::assassinate
::assasinated::assassinated
::assasinates::assassinates
::assasination::assassination
::assasinations::assassinations
::assasined::assassinated
::assasins::assassins
::assassintation::assassination
::assemple::assemble
::assertation::assertion
::asside::aside
::assisnate::assassinate
::assit::assist
::assitant::assistant
::assocation::association
::assoicate::associate
::assoicated::associated
::assoicates::associates
::assosication::assassination
::asssassans::assassins
::assualt::assault
::assualted::assaulted
::assymetric::asymmetric
::assymetrical::asymmetrical
::asteriod::asteroid
::asthetic::aesthetic
::asthetical::aesthetical
::asthetically::aesthetically
::asume::assume
::aswell::as well
::atain::attain
::atempting::attempting
::atheistical::atheistic
::athenean::athenian
::atheneans::athenians
::athiesm::atheism
::athiest::atheist
::atorney::attorney
::atribute::attribute
::atributed::attributed
::atributes::attributes
::attemp::attempt
::attemped::attempted
::attemt::attempt
::attemted::attempted
::attemting::attempting
::attemts::attempts
::attendence::attendance
::attendent::attendant
::attendents::attendants
::attened::attended
::attension::attention
::attitide::attitude
::attributred::attributed
::attrocities::atrocities
::audeince::audience
::austrailia::australia
::austrailian::australian
::auther::author
::authobiographic::autobiographic
::authobiography::autobiography
::authorative::authoritative
::authorites::authorities
::authorithy::authority
::authoritiers::authorities
::authoritive::authoritative
::authrorities::authorities
::autochtonous::autochthonous
::autoctonous::autochthonous
::automaticly::automatically
::automibile::automobile
::automonomous::autonomous
::autor::author
::autority::authority
::auxilary::auxiliary
::auxillaries::auxiliaries
::auxillary::auxiliary
::auxilliaries::auxiliaries
::auxilliary::auxiliary
::availablity::availability
::availaible::available
::availble::available
::availiable::available
::availible::available
::avalable::available
::avalance::avalanche
::avaliable::available
::avation::aviation
::avengence::a vengeance
::averageed::averaged
::avilable::available
::awared::awarded
::awya::away

#4 2007-09-30 09:06

LawOfNonContradiction
Member
From: USA
Registered: 2006-04-28
Posts: 45

Yogui wrote:

check mp3s against misspeliing errors

lol @ grammar

Please don't inflict a spell check feature on us. Such a "feature" would add bloat to your delightfully light program. It seems most suitable as a preset or script.

#5 2007-09-30 21:26

den4b
Administrator
From: den4b.com
Registered: 2006-04-06
Posts: 3,379

Yogui:

You'll need to learn some regular expressions to rearrange those 6000+ words: http://www.regular-expressions.info/

NOTE: 1 huge rule will be MUCH faster than 6000+ small ones!!


LawOfNonContradiction:

NO, I'm not planning to add a spell checker.

But if Yogui makes a preset and posts it here, it will surely be useful for some other users...

#6 2007-10-10 22:02

Yogui
Member
Registered: 2007-09-27
Posts: 41

Hi,

Sorry I didn't reply for a while.

I'm still trying to learn some regular expressions. Starting from zero, after a couple of hours I still don't have a clue.

If someone else is interested in this little preset project (to share on this forum later on) and can't find the list on Wikipedia, let me know and I can mail it.

Cheers, Yogui.

#7 2007-10-11 16:00

den4b
Administrator
From: den4b.com
Registered: 2006-04-06
Posts: 3,379

1. Put all your words into a text file, so you have this:

::abandonned::abandoned
::aberation::aberration
::abilties::abilities
::abilty::ability
::abondon::abandon

2. Replace "::(.+)::(.+)" pattern with "\1*|*", and you should get this:

abandonned*|*
aberation*|*
abilties*|*
abilty*|*
abondon*|*

3. Now replace "\n" with "" (empty), to remove new lines, and get this:

abandonned*|*aberation*|*abilties*|*abilty*|*abondon*|*

4. Repeat the same operation on the original list, but this time change "\1" to "\2", to extract the corrected words:

abandoned*|*aberration*|*abilities*|*ability*|*abandon*|*

P.S. Don't forget to remove the last delimiter; it is useless.

P.S.2. PLEASE, do not post a rule for 6000+ words as plain text here!
Upload it to a free file hosting service, like RapidShare, Megaupload, etc.
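
For anyone who would rather script the four steps than run them by hand in an editor, here is a minimal Python sketch of the same idea. It reuses the "::(.+)::(.+)" pattern from step 2 and joins the captured words with `*|*`, so no trailing delimiter is left behind. The function name `build_replace_fields` is made up for this example, and the sample lines are the five entries shown in step 1.

```python
import re

DELIM = "*|*"
PAIR = re.compile(r"::(.+)::(.+)")  # same pattern as in step 2

SAMPLE = """\
::abandonned::abandoned
::aberation::aberration
::abilties::abilities
::abilty::ability
::abondon::abandon
"""


def build_replace_fields(lines):
    """Return (find_field, replace_field) built from '::wrong::right' lines."""
    wrong_words, right_words = [], []
    for line in lines:
        match = PAIR.fullmatch(line.strip())
        if match is None:
            continue  # skip blank lines, ';' comments, and anything malformed
        wrong_words.append(match.group(1))
        right_words.append(match.group(2))
    # str.join puts the delimiter only between items, so nothing to trim.
    return DELIM.join(wrong_words), DELIM.join(right_words)


find_field, replace_field = build_replace_fields(SAMPLE.splitlines())
print(find_field)
print(replace_field)
```

The two printed strings are what would go into the find and replace boxes of the single Replace rule.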

#8 2007-10-19 01:43

Yogui
Member
Registered: 2007-09-27
Posts: 41

Thanks!
I got it working, but it's not that good.
It may work after a lot of fine-tuning.
I think it's because the original list replaces the mistypes while you are typing,
but this one checks the original string 4748 times (to be precise), once against every mistype, in alphabetical order.
So it corrects a mistake and very often corrects the result again, thinking it is another mistake,
e.g. it replaces "andor" anywhere in the string with "and or".

Silly programs... they just do what they are told to do, not what we want them to do :-)

If some day I do something useful with it, I'll upload it somewhere and post a link.

Cheers, Yogui.
If someone wants to play with it, I'm happy to mail it.
PS: it's 165 KB and the performance is not bad... considering what it does.
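
The misfires described above can be reproduced with a small Python sketch. It assumes the rule behaves like a chain of plain substring replacements applied in list order, which is a simplification for illustration and not a statement about ReNamer's internals. The three pairs come from the list posted earlier in this thread; the function name `replace_everywhere` is hypothetical.

```python
# Three pairs taken from the list in this thread, in alphabetical order.
RULES = [("adn", "and"), ("agre", "agree"), ("agred", "agreed")]


def replace_everywhere(name, rules):
    """Apply each rule as a plain substring replacement, one after another."""
    for wrong, right in rules:
        name = name.replace(wrong, right)
    return name


# Both input words are spelled correctly, yet both get "corrected".
print(replace_everywhere("Gladness agreed", RULES))  # Glandess agreeed
```

"Gladness" contains the letters "adn" and "agreed" contains "agre", so correctly spelled words are damaged as soon as a short entry matches inside a longer word.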

#9 2007-11-01 20:58

Stefan
Moderator
From: Germany, EU
Registered: 2007-10-23
Posts: 1,161

Maybe Denis could add a "[ ] Whole Words Only" option for the 'Insert', 'Remove', and 'Replace' rules.

Here is an example picture: ReNamer_WholeWordsOnly.PNG
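
As a rough illustration of what a "whole words only" option could mean, the Python sketch below matches a misspelling only when it is bounded by non-word characters, and makes all corrections in a single pass so a corrected word is never corrected again. This is one possible interpretation, not a description of any existing ReNamer feature; the names `CORRECTIONS` and `fix_whole_words` are made up for the example.

```python
import re

# A few pairs from the list in this thread.
CORRECTIONS = {"adn": "and", "agre": "agree", "agred": "agreed"}

# One alternation, longest entries first, anchored on word boundaries (\b).
_pattern = re.compile(
    r"\b(?:"
    + "|".join(re.escape(w) for w in sorted(CORRECTIONS, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def fix_whole_words(name):
    """Correct listed misspellings only where they stand alone as whole words."""
    # The replacement is looked up in lower case, so the original casing is lost.
    return _pattern.sub(lambda m: CORRECTIONS[m.group(0).lower()], name)


print(fix_whole_words("Gladness agreed adn agre"))  # Gladness agreed and agree
```

One caveat for file names: Python's `\b` treats the underscore as a word character, so "adn_live" would be left untouched unless underscores are handled separately.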


#10 2007-11-03 14:38

den4b
Administrator
From: den4b.com
Registered: 2006-04-06
Posts: 3,379

What exactly is the "whole words" option supposed to do?
