Announcing the French Phonetic Dictionary project – looking for volunteers

The project is dropped - read why

According to site statistics, the French phonetic translator is the most frequently used resource on EasyPronunciation.com. If you have used it, you have probably noticed that it sometimes gives erroneous results. To improve it, I decided to start the French Phonetic Dictionary project. Before going into details, I would like to explain briefly how my French translator works.

First, I use a dictionary that contains the phonetic transcription of the main forms of French words, in the following format:

  • prononcer_pʀɔnɔ̃se
  • prononciation_pʀɔnɔ̃sjasjɔ̃
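For readers who want to work with this format programmatically, here is a minimal sketch of how one such line could be split into a word and its transcription. It assumes that the underscore is the only separator and that a missing transcription shows up as an empty right-hand side; the names `Entry` and `parse_entry`, and the sample entry `Bordeaux_`, are made up for illustration and are not taken from the actual dictionary.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    word: str
    ipa: Optional[str]  # None when the transcription is missing


def parse_entry(line: str) -> Entry:
    """Split a 'word_transcription' line at the first underscore."""
    word, sep, ipa = line.strip().partition("_")
    ipa = ipa.strip()
    # Treat "no underscore" and "nothing after the underscore" as missing.
    return Entry(word.strip(), ipa if sep and ipa else None)


sample = [
    "prononcer_pʀɔnɔ̃se",
    "prononciation_pʀɔnɔ̃sjasjɔ̃",
    "Bordeaux_",  # hypothetical entry with a missing transcription
]
entries = [parse_entry(line) for line in sample]
missing = [e.word for e in entries if e.ipa is None]
print(missing)
```

A list like `missing` is one simple way a proofreader could find the entries that still need a transcription.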

Let's call it the French Phonetic Dictionary (FPD). I used miscellaneous sources to create it. The phonetic transcription is missing for some words (especially proper nouns) and contains errors for others. This dictionary is the primary source of conversion errors.

Second, the translator uses a very complex set of rules that modify the phonetic transcription of the main form in order to obtain the correct transcription of a modified form. To build these rules, I used the free French dictionary Dicollecte. Although the rules are numerous, they cause only occasional bugs, which are usually very easy to fix.

As you can see, eliminating the errors from the French Phonetic Dictionary would greatly improve the overall quality of the conversion. To remove the errors, the dictionary has to be proofread by real people, and for that I need volunteers. The question is how to motivate people to do it.

My guess is that people interested in this project would like the results of their work to be freely available for everyone to download. I therefore suggest publishing the dictionary on EasyPronunciation.com under a Creative Commons Attribution-ShareAlike 4.0 license. This means that anyone may use the dictionary for both non-commercial and commercial purposes, provided that they state where they obtained the data and that, if they improve or extend the dictionary, they share their changes under the same license.

To give you an idea of the amount of work, the dictionary contains 71,405 entries. My rough estimate of the error rate is 5% for regular words and 95% for proper nouns.

If you want to participate, please contact me. I will send you a text file containing a part of the dictionary. You will need to find and correct any errors and add the missing transcriptions. As soon as you send the proofread file back to me, I will update my database and publish the proofread parts of the dictionary here under the CC BY-SA 4.0 license mentioned above.
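To illustrate the update step, here is a rough sketch of how a proofread file could be merged back into a database. It is only an assumption about how this might be done, not a description of my actual setup: the database is modelled as a plain Python dictionary, the function name `merge_proofread` is invented, and the "wrong" transcription in the example is made up for the demonstration.

```python
from typing import Dict, List


def merge_proofread(database: Dict[str, str], proofread_lines: List[str]) -> List[str]:
    """Apply corrected 'word_transcription' lines; return the words that changed."""
    changed = []
    for line in proofread_lines:
        word, sep, ipa = line.strip().partition("_")
        if not sep or not ipa:
            continue  # transcription still missing: keep the database as it is
        if database.get(word) != ipa:
            database[word] = ipa  # new or corrected transcription
            changed.append(word)
    return changed


# Hypothetical state: one entry carries a deliberately wrong transcription.
database = {"prononcer": "pʀonose", "prononciation": "pʀɔnɔ̃sjasjɔ̃"}
proofread = ["prononcer_pʀɔnɔ̃se", "prononciation_pʀɔnɔ̃sjasjɔ̃"]
changed = merge_proofread(database, proofread)
print(changed)
```

Returning the list of changed words makes it easy to review what a volunteer actually corrected before the proofread part is published.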

If you have any questions or suggestions concerning this project, please feel free to contact me.

Update: the dictionary is available for download here.

Tags: French, phonetic dictionary, French phonetics, French pronunciation, phonetic transcription, IPA