Bidumsáme Báhkogirrje

Pitesamisk ordlista • Pite Saami lexical database

This webpage provides documentation for version 1.0 of the Pite Saami lexical database as stored in the .json file available via the repository of the Center for Estonian Language Resources or behind DOI 10.15155/z0p6-k519.

Use of the database is governed by a CC-BY-NC-SA 4.0 license. For more information, contact Joshua Wilbur (joshua.wilbur@ut.ee), Institute of Estonian and General Linguistics, University of Tartu, Estonia. A searchable web-based instance of the most current version is available here.

The resource should be cited as:
Wilbur, Joshua, Peter Steggo, Olve Utne, Nils-Henrik Bengtsson, Marianne Eriksson, Inger Fjällås, Eva-Karin Rosenberg, Gry Helen Sivertsen, Valborg Sjaggo & Dagny Skaile. 2024. The Pite Saami Lexical Database. version 1.0. Center for Estonian Language Resources. DOI: 10.15155/z0p6-k519.

This .json file uses its own set of names for the key-value pairs; these names as well as the overall structure of the file are documented here.

Here is an example of the entry for the word guolle as it is structured in the JSON database file:

{
  "no": "695",
  "sje": "guolle",
  "PoS": "subst",
  "Cgrad": "ll-l",
  "syllStruc": "CVCCV",
  "syllCount": "2",
  "ENG": "fish",
  "SVE": "fisk"
},

This entry shows that the ID number (no) is 695, the Pite Saami orthographic representation (sje) is guolle, the part of speech (PoS) is subst (noun), the consonant gradation pattern (Cgrad) is ll-l, the syllable structure (syllStruc) is CVCCV, the word has a total of 2 syllables (syllCount), and it translates to fish in English (ENG) and fisk in Swedish (SVE).

Overall structure
The JSON file's root node contains one child node for metadata about the database (the initial child node) and one child node for each of the 7661 entries in the database. Each lexical entry node is flat and contains at most the key-value pairs listed in the table below; most entries contain only a subset of these keys, as in the example above. All values are stored as strings, even those which could be considered integers.
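As a minimal sketch of how this structure might be read in Python: the helper below separates the initial metadata node from the lexical entries. The function name split_database is hypothetical, and the documentation does not state whether the root node is a JSON array or a JSON object, so the sketch accepts either. The small sample stands in for the real file, and its metadata key is invented for illustration.

```python
import json


def split_database(root):
    """Return (metadata, entries) from the parsed JSON root.

    Assumption: the root is a JSON array or a JSON object whose first
    child is the metadata node and whose remaining children are entries.
    """
    children = list(root.values()) if isinstance(root, dict) else list(root)
    if not children:
        return None, []
    return children[0], children[1:]


# Stand-in for the real file; the metadata key "title" is invented.
sample = json.loads("""
[
  {"title": "hypothetical metadata node"},
  {"no": "695", "sje": "guolle", "PoS": "subst", "Cgrad": "ll-l",
   "syllStruc": "CVCCV", "syllCount": "2", "ENG": "fish", "SVE": "fisk"}
]
""")

metadata, entries = split_database(sample)

# Entries only carry a subset of keys, so use .get() rather than indexing.
nouns = [entry["sje"] for entry in entries if entry.get("PoS") == "subst"]
```

With the real file, the sample would be replaced by the result of json.load on the downloaded .json file, opened with UTF-8 encoding so that characters such as å and ä are read correctly.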

Key name          Description
no                unique entry number from the source database
sje               Pite Saami entry (using Pite Saami orthography)
PoS               part of speech, with possible values: [verb, adj (adjective), subst (noun), adv (adverb), konj (conjunction), egen (proper noun), post (postposition), pro (pronoun), dem (demonstrative), ptkl (particle), fråg (question word), num (numeral); fras (phrase), änd (suffix), utr (expression)] (NB: the final three values do not actually correspond to parts of speech but are entry types used to classify entries from the original source wordlist from the project Insamling av pitesamiska ord)
Cgrad             consonant gradation pattern
umlaut            umlaut pattern
stemExt           final stem consonant extension pattern (includes the entire stem in cases with a stem-final vowel alternation)
ENG               English translation
SVE               Swedish translation
syllStruc         syllable structure [C = consonant, V = vowel]
syllCount         number of syllables
morphCats         morphological categories for entries which are not the citation form of the lexeme; more or less based on Swedish terms (e.g., ack for accusative, komp for comparative) and Giellatekno standards
adjType           adjective type (attributive or predicative), with possible values: [ATTR, PRED, ATTR/PRED]
variant2headword  indicates the main variant if the current entry is not the main variant of the lemma
variantOfRefNo    indicates the entry number for the main variant of the lemma if the current entry is not the main variant
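
Because every value is a string and most keys are optional, code that consumes the entries usually needs a little conversion and lookup logic. The following sketch illustrates two such steps: reading syllCount as an integer, and following variantOfRefNo to the main variant. The helper names syllable_count and main_variant are hypothetical, and the second sample entry is invented purely to exercise the variant keys; it is not taken from the database.

```python
def syllable_count(entry):
    """Return syllCount as an int, or None if the key is absent or not numeric."""
    value = entry.get("syllCount")
    if value is not None and value.isdigit():
        return int(value)
    return None


def main_variant(entry, by_no):
    """Return the main-variant entry referenced by variantOfRefNo.

    Falls back to the entry itself when it has no reference or when the
    referenced entry number is not found in the index.
    """
    ref = entry.get("variantOfRefNo")
    return by_no.get(ref, entry) if ref else entry


entries = [
    {"no": "695", "sje": "guolle", "PoS": "subst", "syllCount": "2"},
    # Invented placeholder entry, for illustration only.
    {"no": "9999", "sje": "example-variant",
     "variant2headword": "guolle", "variantOfRefNo": "695"},
]

# Index entries by their entry number; the numbers stay strings, as in the file.
by_no = {entry["no"]: entry for entry in entries}

headword = main_variant(entries[1], by_no)["sje"]
```

Keeping the entry numbers as strings in the index avoids a conversion step, since variantOfRefNo is itself stored as a string.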

last updated: 2025-05-07