Briefly, the most significant differences between the spoken and written varieties of any language, including English, are formality versus spontaneity, prescriptive correctness versus descriptive change, and the history of each. Starting with the last first: in the history of humankind and in the life history of each individual, spoken language comes first and does not need to be institutionally taught.
Humankind started speaking at some as yet imperfectly marked date. Humankind began writing much later, in Sumer around 3,500 B.C. Babies almost universally begin speaking before they are two years old. They begin writing some years later under the influence of some sort of dedicated education (Does Big Bird count as dedicated education?). Further, speech records history as a series of memorized oral traditions, stories, legends, and myths that carry an inherent possibility for variation, whereas written language more accurately records stories, myths, legends, traditions, and higher-order conceptual writings like maths and philosophy, with a presumed lessened possibility for variation of texts over time.
To return to the first, formality versus spontaneity: written language, even before standardization, has always required a fundamentally unvarying form that could be comprehended with ease and reliability by other readers of that language. This form is constructed in an isolated environment, and the text may have its errors corrected or undergo revisions in order to communicate effectively. Spoken language must communicate effectively at the moment it is uttered, and for that reason it has tags and fillers that buy time while thoughts are mentally prepared for the presentation of an organized message.
Related to this is the difference between prescriptive correctness and descriptive changeability. In writing, especially in higher orders of formal writing but to some degree in all writing, prescribed rules of correctness must be followed so that the text coherently communicates a specialized message that must stand on its own, without the aid of situational cues, facial expressions, or tone of voice. In spoken language, speakers can--and indeed do--make changes to their speech, or a community of speakers may make group changes to their speech, that are not prescribed in any way (aside from the pressures of social prestige or negative prestige) but may be measured descriptively to record--and describe--changes.