LIS 5043: Organization of Information
A system for choosing or highlighting some characteristics (attributes), together with a specification of the rules for selection (codes)
This implies a trade-off: if some characteristics are highlighted, other characteristics are left behind
ENTITIES: objects or concepts
ATTRIBUTES: characteristics of entities
DIACHRONIC: stable across time
SYNCHRONIC: changes across time






Indexer has selected (perhaps among others) the concept that the patrons will want


Indexer picks a different topic
Indexer and patron use different terms for the same concept
Patrons cannot articulate just what the question state is
Indexer
describes doc
predicts use
Patron
describes doc
predicts doc
What patron attributes can we know?
What document attributes can we know?
How can we use this knowledge to open the bottleneck between patrons in need and the documents that might be of use?
1.1 Consistency
1.2 Subject Expertise
1.3 Indexing Expertise
2.1 Searching Experience
2.2 Domain Knowledge
3.1 Motivation Level
3.2 Emotional State
Use of Standards/Rules (code)
Depends on Resources/Audience
A process in which sets of records or documents are searched to find items which may help to satisfy the information need
IR is concerned with:
representation
storage
organization
accessing of information objects
User Group
Information Need
Information Sources
Information System
Results of the Query
User Selection & Evaluation (Relevance)
Most IR is based on techniques introduced in the 1960s
IR is no longer just a library problem
As a result of these evolved uses, users expect high standards of retrieval
We can divide IR techniques into basic classes
Simple Match Model
Request = Information Data
Document A = data, information
Document B = data, information
Document C = information, retrieval
Advantages: simple process; widespread; familiar
Disadvantages: single descriptor requests less effective in large databases
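The simple match example above can be sketched in a few lines of Python. This is only an illustration: it assumes a document matches when its descriptors include every term in the request, and the names `documents` and `simple_match` are made up for this sketch.

```python
# Descriptors assigned to each document, taken from the example above.
documents = {
    "A": {"data", "information"},
    "B": {"data", "information"},
    "C": {"information", "retrieval"},
}


def simple_match(request, docs):
    """Return the names of documents whose descriptors include every request term.

    Assumption: a match requires all request terms to be present.
    """
    terms = {term.lower() for term in request.split()}
    return sorted(name for name, descriptors in docs.items() if terms <= descriptors)


matches = simple_match("Information Data", documents)
print(matches)  # ['A', 'B']
```

Document C is left out because it carries "information" but not "data". With thousands of documents, a single-descriptor request such as "information" would return nearly everything, which is the disadvantage noted above.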
AND, OR, and NOT to allow more complex queries to the IR system
Set Theory
Example of Boolean Search
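The link between Boolean operators and set theory can be shown with Python sets. This is a minimal sketch, not a description of any particular IR system: it reuses Documents A, B, and C from the simple match example, and the names `index` and `postings` are hypothetical. AND corresponds to set intersection, OR to union, and NOT to set difference.

```python
# Inverted index: each descriptor maps to the set of documents that carry it.
# Built from Documents A, B, and C in the simple match example.
index = {
    "data": {"A", "B"},
    "information": {"A", "B", "C"},
    "retrieval": {"C"},
}


def postings(term):
    """Return the set of documents indexed under a term (empty set if unknown)."""
    return index.get(term.lower(), set())


# information AND retrieval -> intersection
and_result = postings("information") & postings("retrieval")

# data OR retrieval -> union
or_result = postings("data") | postings("retrieval")

# information NOT data -> difference
not_result = postings("information") - postings("data")

print(sorted(and_result))  # ['C']
print(sorted(or_result))   # ['A', 'B', 'C']
print(sorted(not_result))  # ['C']
```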

Weighted IR (probabilistic IR)
Semantic or Linguistic Model (NLP)
attempts to get at the “concepts” contained in the information object or the surrogate
syntactic analysis
free text searching
paragraph indexing
discourse analysis
Passage Retrieval
Online catalogs
Online databases
Web Search Engines
