Corpus info
General stats
Tokens | 957958 |
---|
Words | 858741 |
---|
Types | 23000 |
---|
Lemmas | 15403 |
---|
Hapax legomena | 10788 |
---|
Dis legomena | 3456 |
---|
POS tags | 492 |
---|
Documents
Number of documents | 715 |
---|
Average (tokens per document) | 1340 |
---|
Median (tokens per document) | 984 |
---|
Longest document (tokens) | 15662 |
---|
Shortest document (tokens) | 108 |
---|
Oldest document (year) | 1600 |
---|
Most recent document (year) | 1896 |
---|
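The general figures can be cross-checked against each other. The sketch below assumes that "words" means tokens that are not punctuation marks, which the page does not state explicitly; the punctuation count is taken from the part-of-speech table.

```python
# Figures copied from the tables on this page.
tokens = 957958
words = 858741
punctuation = 99217   # "punctuation" row of the part-of-speech table
documents = 715

# Assumption: "words" are the tokens that are not punctuation marks.
non_word_tokens = tokens - words

# Mean document length, rounded to the nearest token as in the table.
mean_tokens_per_document = round(tokens / documents)

print(non_word_tokens, mean_tokens_per_document)
```

Under that assumption the difference between tokens and words equals the punctuation count, and the rounded mean matches the reported average of 1340 tokens per document.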
Group by part of speech
Main POS tag | N | % |
---|
common noun | 209121 | 21.83 |
---|
preposition | 153378 | 16.01 |
---|
determiner | 104146 | 10.87 |
---|
punctuation | 99217 | 10.36 |
---|
verb | 90477 | 9.44 |
---|
numeral | 70160 | 7.32 |
---|
conjunction | 69515 | 7.26 |
---|
adjective | 44485 | 4.64 |
---|
pronoun | 37474 | 3.91 |
---|
proper noun | 33608 | 3.51 |
---|
adverb | 26733 | 2.79 |
---|
untagged | 19279 | 2.01 |
---|
foreign word | 267 | 0.03 |
---|
interjection | 98 | 0.01 |
---|
Total | 957958 | 100.00 |
---|
Group by project
Project | N | % |
---|
CORDEREGRA | 365398 | 38.14 |
---|
HISPATESD | 247580 | 25.84 |
---|
ALEA18 | 176991 | 18.48 |
---|
CORTENEX | 167881 | 17.52 |
---|
_ | 108 | 0.01 |
---|
Total | 957958 | 100.00 |
---|
Group by text type
Text type | N | % |
---|
inventory of goods | 689374 | 71.96 |
---|
witness statement | 192884 | 20.13 |
---|
medical certificate | 71289 | 7.44 |
---|
other | 4411 | 0.46 |
---|
Total | 957958 | 100.00 |
---|
Group by century
Century | N | % |
---|
XVIII | 576984 | 60.23 |
---|
XVII | 270081 | 28.19 |
---|
XIX | 110893 | 11.58 |
---|
Total | 957958 | 100.00 |
---|
Group by province
Province | N | % |
---|
Granada | 167491 | 17.48 |
---|
Almería | 112006 | 11.69 |
---|
Badajoz | 110971 | 11.58 |
---|
Jaén | 106495 | 11.12 |
---|
Madrid | 75183 | 7.85 |
---|
Cádiz | 70772 | 7.39 |
---|
Burgos | 68789 | 7.18 |
---|
Málaga | 68176 | 7.12 |
---|
Cáceres | 61247 | 6.39 |
---|
Sevilla | 33045 | 3.45 |
---|
Huelva | 28388 | 2.96 |
---|
Murcia | 14825 | 1.55 |
---|
Valladolid | 12718 | 1.33 |
---|
La Rioja | 5345 | 0.56 |
---|
Cantabria | 4944 | 0.52 |
---|
Toledo | 3981 | 0.42 |
---|
Palencia | 3785 | 0.40 |
---|
Zamora | 2149 | 0.22 |
---|
Navarra | 2011 | 0.21 |
---|
Álava | 1680 | 0.18 |
---|
Soria | 1444 | 0.15 |
---|
León | 1378 | 0.14 |
---|
Gipuzkoa | 604 | 0.06 |
---|
Teruel | 293 | 0.03 |
---|
Salamanca | 238 | 0.02 |
---|
Total | 957958 | 100.00 |
---|
Group by institution
Institution | N | % |
---|
Archivo de la Real Chancillería de Granada | 237390 | 24.78 |
---|
Archivo Histórico Provincial de Badajoz | 107980 | 11.27 |
---|
Archivo Histórico Provincial de Jaén | 105395 | 11.00 |
---|
Archivo Histórico de Protocolos de Madrid | 73290 | 7.65 |
---|
Archivo Histórico Provincial de Burgos | 66652 | 6.96 |
---|
Archivo Histórico Provincial de Almería | 63774 | 6.66 |
---|
Archivo Histórico Provincial de Cáceres | 59901 | 6.25 |
---|
Archivo Histórico de Protocolos de Granada | 49571 | 5.17 |
---|
Archivo Histórico Provincial de Cádiz | 49534 | 5.17 |
---|
Archivo de la Real Chancillería de Valladolid | 44567 | 4.65 |
---|
Archivo Histórico Provincial de Huelva | 27641 | 2.89 |
---|
Archivo Histórico Provincial de Sevilla | 24702 | 2.58 |
---|
Archivo Histórico Municipal de Lorca | 14825 | 1.55 |
---|
Archivo Municipal de Puerto Real | 14253 | 1.49 |
---|
Archivo Histórico Provincial de Málaga | 11945 | 1.25 |
---|
Archivo Municipal de Vera | 5757 | 0.60 |
---|
Archivo Histórico Municipal de Loja | 781 | 0.08 |
---|
Total | 957958 | 100.00 |
---|
Group by century and province (absolute frequencies)
Province | XV | XVI | XVII | XVIII | XIX | Total (province) | Total (area) |
---|
Almería | | | 26576 | 47666 | 37764 | 112006 | 385992 |
---|
Granada | | | 50902 | 113960 | 2629 | 167491 |
---|
Jaén | | | | 106495 | | 106495 |
---|
Málaga | | | 26419 | 33732 | 8025 | 68176 | 68176 |
---|
Córdoba | | | | | | 0 |
---|
Cádiz | | | | 70664 | 108 | 70772 | 132205 |
---|
Sevilla | | | 168 | 32877 | | 33045 |
---|
Huelva | | | | 28388 | | 28388 |
---|
Madrid | | | | 36074 | 39109 | 75183 | 143972 |
---|
Burgos | | | | 67449 | 1340 | 68789 |
---|
others | | | 166016 | 39679 | 21918 | 227613 | 227613 |
---|
Total (century) | 0 | 0 | 270081 | 576984 | 110893 | 957958 | 957958 |
---|
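The "Total (area)" column appears only on the first row of each group of provinces. The sketch below shows how those area totals can be recovered from the province totals. The grouping of rows into areas is an assumption read off the position of the area cells; the page does not name the areas.

```python
# Province totals copied from the absolute-frequency table.
province_totals = {
    "Almería": 112006, "Granada": 167491, "Jaén": 106495,
    "Málaga": 68176, "Córdoba": 0,
    "Cádiz": 70772, "Sevilla": 33045, "Huelva": 28388,
    "Madrid": 75183, "Burgos": 68789,
    "others": 227613,
}

# Assumed row groups: each group starts at a row that carries an area total.
areas = [
    ["Almería", "Granada", "Jaén"],
    ["Málaga", "Córdoba"],
    ["Cádiz", "Sevilla", "Huelva"],
    ["Madrid", "Burgos"],
    ["others"],
]

area_totals = [sum(province_totals[p] for p in group) for group in areas]
corpus_total = sum(area_totals)

# Relative frequency of each area, as a percentage of all tokens.
area_shares = [round(100 * total / corpus_total, 2) for total in area_totals]

for group, total, share in zip(areas, area_totals, area_shares):
    print(" + ".join(group), total, share)
```

With this grouping the area totals sum to the corpus total of 957958 tokens.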
Group by century and province (relative frequencies)
Province | XV | XVI | XVII | XVIII | XIX | Total (province) | Total (area) |
---|
Almería | 0.00 | 0.00 | 2.77 | 4.98 | 3.94 | 11.69 | 40.29 |
---|
Granada | 0.00 | 0.00 | 5.31 | 11.90 | 0.27 | 17.48 |
---|
Jaén | 0.00 | 0.00 | 0.00 | 11.12 | 0.00 | 11.12 |
---|
Málaga | 0.00 | 0.00 | 2.76 | 3.52 | 0.84 | 7.12 | 7.12 |
---|
Córdoba | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
---|
Cádiz | 0.00 | 0.00 | 0.00 | 7.38 | 0.01 | 7.39 | 13.80 |
---|
Sevilla | 0.00 | 0.00 | 0.02 | 3.43 | 0.00 | 3.45 |
---|
Huelva | 0.00 | 0.00 | 0.00 | 2.96 | 0.00 | 2.96 |
---|
Madrid | 0.00 | 0.00 | 0.00 | 3.77 | 4.08 | 7.85 | 15.03 |
---|
Burgos | 0.00 | 0.00 | 0.00 | 7.04 | 0.14 | 7.18 |
---|
others | 0.00 | 0.00 | 17.33 | 4.14 | 2.29 | 23.76 | 23.76 |
---|
Total (century) | 0.00 | 0.00 | 28.19 | 60.23 | 11.58 | 100.00 | 100.00 |
---|
Measures of lexical diversity
Measure | Description | Formula | Result |
---|
TTR | type-token ratio | | 0.027 |
---|
RTTR | Guiraud's root type-token ratio | | 24.820 |
---|
CTTR | Carroll's corrected type-token ratio | | 17.550 |
---|
C | Herdan's C index | | 0.735 |
---|
S | Somers' S index | | 0.882 |
---|
M | Maas' index | | 0.036 |
---|
H | Honoré's index | | 2573.322 |
---|
K | Yule's K index | | 172.143 |
---|
D | Simpson's D index | | 0.017 |
---|
HTR | Hapax-token ratio | | 0.469 |
---|
DTR | Dis-token ratio | | 0.150 |
---|
VGR | Vocabulary growth rate | | 0.013 |
---|
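The Formula column is empty in this table. As a rough sketch, the measures that depend only on the counts in the general statistics can be recomputed with their usual textbook definitions. Several choices here are assumptions rather than statements from the page: N is taken to be the number of words (not tokens), logarithms are natural, and HTR and DTR are divided by the number of types, since that is what the reported values correspond to. Yule's K and Simpson's D need the full frequency spectrum, and the variant of Maas' index used is not stated, so all three are left out.

```python
import math

# Counts copied from the general statistics.
N = 858741    # words (assumption: the measures use words, not tokens)
V = 23000     # types
V1 = 10788    # hapax legomena (types occurring once)
V2 = 3456     # dis legomena (types occurring twice)

measures = {
    "TTR": V / N,                                          # type-token ratio
    "RTTR": V / math.sqrt(N),                              # Guiraud's root TTR
    "CTTR": V / math.sqrt(2 * N),                          # Carroll's corrected TTR
    "C": math.log(V) / math.log(N),                        # Herdan's C
    "S": math.log(math.log(V)) / math.log(math.log(N)),    # Somers' S
    "H": 100 * math.log(N) / (1 - V1 / V),                 # Honoré's index
    "HTR": V1 / V,                                         # hapax share (assumed per type)
    "DTR": V2 / V,                                         # dis share (assumed per type)
    "VGR": V1 / N,                                         # vocabulary growth rate
}

for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the recomputed values agree closely with the results listed in the table.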