Speech production is not the same as language production, since language can also be produced manually, through signs.
In ordinary fluent conversation, people pronounce roughly four syllables, ten to twelve phonemes, and two to three words per second, drawn from a vocabulary that can contain 10,000 to 100,000 words. Errors in speech production are relatively rare, occurring at a rate of about one in every 900 words in spontaneous speech. Words that are commonly spoken, learned early in life, or easily imagined are quicker to say than ones that are rarely said, learned later in life, or abstract.
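As a rough illustration of what these figures imply, the following sketch converts the quoted error rate into the time a speaker might talk between errors. It assumes the rates hold constant and that the speaker talks without pauses, which real conversation does not guarantee; the function and variable names are illustrative only.

```python
# Figures quoted in the text: two to three words per second,
# and about one error in every 900 words of spontaneous speech.
WORDS_PER_SECOND_RANGE = (2, 3)
WORDS_PER_ERROR = 900


def seconds_between_errors(words_per_second, words_per_error=WORDS_PER_ERROR):
    """Return the speaking time, in seconds, needed to produce one error on average.

    Assumes a constant speaking rate with no pauses (a simplification).
    """
    return words_per_error / words_per_second


slow_rate, fast_rate = WORDS_PER_SECOND_RANGE
longest_gap_min = seconds_between_errors(slow_rate) / 60   # slower speech, longer gap
shortest_gap_min = seconds_between_errors(fast_rate) / 60  # faster speech, shorter gap

print(f"About one error every {shortest_gap_min} to {longest_gap_min} minutes of continuous speech")
```

Under these assumptions, a speaker would make about one error for every five to seven and a half minutes of continuous speech.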
The production of spoken language involves three major levels of processing: conceptualization, formulation, and articulation.
The first is the process of conceptualization, or conceptual preparation, in which the intention to create speech links a desired concept to the particular spoken words to be expressed. Here the preverbal intended messages are formulated, specifying the concepts to be verbally expressed.
The second stage is formulation, in which the linguistic form required for the expression of the desired message is created. Formulation includes grammatical encoding, morpho-phonological encoding, and phonetic encoding. Grammatical encoding is the process of selecting the appropriate syntactic word, or lemma. The selected lemma then activates the appropriate syntactic frame for the conceptualized message. Morpho-phonological encoding is the process of breaking words down into syllables to be produced in overt speech. This syllabification depends on the preceding and following words; for instance, "I comprehend" is syllabified as I-com-pre-hend, whereas "I comprehend it" is syllabified as I-com-pre-hen-dit. The final part of the formulation stage is phonetic encoding. This involves the activation of articulatory gestures that depend on the syllables selected in the morpho-phonological process, creating an articulatory score as the utterance is pieced together and the order of movements of the vocal apparatus is completed.
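The context dependence of syllabification in the example above can be sketched in a few lines of code. This is a toy illustration, not a model from the psycholinguistic literature: it assumes that letters stand in for phonemes, that the following word is a single syllable, and that only one word-final consonant moves into the onset of a vowel-initial following word. The function name `resyllabify` is hypothetical.

```python
VOWELS = set("aeiou")  # assumption: letters stand in for phonemes


def resyllabify(syllables, next_word):
    """Append a following one-syllable word to an already syllabified sequence.

    If the last syllable ends in a consonant and the next word starts with a
    vowel, that consonant is moved to become the onset of the new syllable.
    """
    syllables = list(syllables)  # copy so the caller's list is not modified
    if not next_word:
        return syllables
    if not syllables:
        return [next_word]
    last = syllables[-1]
    starts_with_vowel = next_word[0].lower() in VOWELS
    ends_with_consonant = last[-1].lower() not in VOWELS
    if starts_with_vowel and ends_with_consonant and len(last) > 1:
        syllables[-1] = last[:-1]
        return syllables + [last[-1] + next_word]
    return syllables + [next_word]


i_comprehend = ["I", "com", "pre", "hend"]
print("-".join(resyllabify(i_comprehend, "")))    # I-com-pre-hend
print("-".join(resyllabify(i_comprehend, "it")))  # I-com-pre-hen-dit
```

The point of the sketch is only that the syllable boundaries of "comprehend" are not fixed in advance: the same word yields "hend" or "hen" plus "dit" depending on what follows it.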
The third stage of speech production is articulation, which is the execution of the articulatory score by the lungs, glottis, larynx, tongue, lips, jaw, and other parts of the vocal apparatus, resulting in overt speech.
Jescheniak, JD; Levelt, WJM (1994). "Word frequency effects in speech production: retrieval of syntactic information and of phonological form". Journal of Experimental Psychology: Learning, Memory, and Cognition 20 (4): 824–843. doi:10.1037/0278-7393.20.4.824. CiteSeerX: 10.1.1.133.3919.
Levelt, W. (1999). "The neurocognition of language", pp. 87–117. Oxford University Press.
Ackermann, H (2008). "Cerebellar contributions to speech production and speech perception: psycholinguistic and neurobiological perspectives". Trends in Neurosciences 31 (6): 265–72. doi:10.1016/j.tins.2008.02.011. PMID 18471906.