The Switchboard Telephone Speech Corpus is a corpus of spoken English consisting of almost 260 hours of speech. It was created in 1990 by Texas Instruments under a DARPA grant and released in 1992 by NIST. The corpus contains 2,400 telephone conversations among 543 US speakers (302 male, 241 female).[1][2][3] Participants did not know each other, and conversations were held on topics from a predetermined list.[4]
Switchboard-2 Phase II was collected in 1999 and includes "4,472 five-minute telephone conversations involving 679 participants".[5]
A: All right um well [laughter-uh] let's see i'm twenty
B: How old are you Lisa. Okay that i'm older
A: Yeah how old are you. Older [laughter]
B: Older than you [laughter-are]
A: [laughter-okay]
B: Okay we are supposed to talk about places we like to go so i'm gonna and where are you from where are you calling from?
A: I'm calling from uh Provo Utah but I'm from Plano Texas
B: Oh you are from Plano my sister lives in Plano yes her husband is the new Director of Admissions at uh University of Texas at Dallas
A: Oh really. Oh wow my dad used to work at UTD also
B: Yeah so I [vocalized-noise]. Anyway so where's your favorite place to go?
A: Um. Generally we just go on family vacations to Arizona my grandparents live there that's generally our usual summer vacation
Soni, Mayank; Spillane, Brendan; Gilmartin, Emer; Saam, Christian; Cowan, Benjamin R.; Wade, Vincent (2021). "An Empirical Study of Topic Transition in Dialogue". arXiv:2111.14188 [cs.CL].