A range of electronic corpora has become increasingly accessible via the WWW and CD-ROM. This development has coincided with improvements in the standards governing the collection, encoding and archiving of such data. Less attention, however, has been paid to making other types of digital data available, especially those that one might describe as 'unconventional', namely dialect, child language and bilingual databases. Advances in technology have enabled the collection and organisation of such data sets into a growing number of user-friendly electronic corpora. These corpora have the potential to offer new insights into linguistic universals, for instance, since they allow, for the first time, rapid and systematic comparisons between first and second languages or dialects across both social and geographical space. This book provides state-of-the-art methods and guidelines for creating and digitising these resources, taking full advantage of the dramatic recent improvements in computing and analytical tools.