LIS 4/5523: Online Information Retrieval
The generation of data and information in every discipline across the universe of knowledge has seen staggering growth
Storing, managing, querying, and retrieving huge amounts of data and information requires sophisticated procedures and advanced technologies
Nowadays, information collections are web-based and online, vast, and growing at an exponential rate


Definition
A process in which sets of records or documents are searched to find items that may help satisfy an information need
Information Retrieval includes:

J.W. Sammon (1969) introduced the idea of a visualization interface integrated into an IR system in his famous paper “A nonlinear mapping for data structure analysis”
First online systems: NLM’s AIM-TWX, MEDLINE; Lockheed’s Dialog; SDC’s ORBIT

The ACM SIGIR Conference started in 1978 and subsequently emerged as the apex conference on IR systems
Belkin, Oddy, and Brooks introduced the concept of the Anomalous State of Knowledge (ASK) for information retrieval in 1982
The OKAPI model was formulated in 1982-88: a set-oriented, ranked-output design for probabilistic retrieval of textual material using an inverted index
A major breakthrough came in 1989, when Tim Berners-Lee proposed the World Wide Web at CERN
The TREC conference started in 1992 as part of the TIPSTER Text Program, sponsored by the US Department of Defense and the National Institute of Standards and Technology (NIST)
PageRank algorithm was developed at Stanford University by Larry Page and Sergey Brin in 1996
Google registered its domain in 1997 and was incorporated as Google Inc. in 1998; it now dominates the search engine domain
Google personalized search started in 2005
Multimedia IR (Smeulders, Lew, Sebe) was integrated into search in 2010
Semantic embedding models such as Word2Vec and GloVe first appeared in 2013-2014
Google introduced BERT in 2018
Conversational IR was introduced in assistants such as Alexa and Siri in 2020-2021
Retrieval-Augmented Generation (RAG) in 2022-2023
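
The probabilistic ranked retrieval mentioned in the OKAPI entry above can be illustrated with BM25, the ranking function that later came out of the Okapi work. The following is a minimal sketch only: the parameter values k1=1.5 and b=0.75, the smoothed IDF variant, the function name bm25_scores, and the three toy documents are all assumptions chosen for illustration, and the sketch scans documents directly rather than using an inverted index.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF variant (assumption): the +1 keeps it non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Toy collection (hypothetical), already tokenized by whitespace.
docs = [
    "probabilistic retrieval of text".split(),
    "web search engines rank pages".split(),
    "ranked retrieval with an inverted index".split(),
]
scores = bm25_scores(["ranked", "retrieval"], docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking)  # prints [2, 0, 1]
```

The third document ranks first because it matches both query terms; the second matches neither and scores zero. This is what "ranked output" means in contrast to a plain Boolean match.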

Latent Semantic Indexing (LSI) gained huge popularity on the WWW and was widely used in Search Engine Optimization (SEO)
Latent Dirichlet Allocation (LDA), a generative topic model in NLP, was developed by David Blei, Andrew Ng, and Michael Jordan in 2003
A user is a person who uses information and/or information systems in some meaningful way
A user can be:
Users are motivated to seek information in a given situation to:
Typical user questions:
Two broad categories of searches:
A specialized system for the description, storage, and retrieval of information representations: primarily information objects (text, images) and their surrogates (metadata, records). Operates by matching queries (representations of information need) with data (representations of information objects)
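
The matching of queries against representations of information objects can be sketched with a tiny inverted index. This is an illustrative sketch under stated assumptions, not a description of any particular IR system: the function names (tokenize, build_index, search), the crude tokenizer, the Boolean AND matching rule, and the sample documents are all hypothetical.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and keep runs of letters and digits (a crude tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every query term (Boolean AND)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Surrogates (here, short titles) stand in for the information objects.
docs = {
    1: "Online information retrieval systems",
    2: "Retrieval of images and their metadata",
    3: "Information needs of users",
}
index = build_index(docs)
print(sorted(search(index, "information retrieval")))  # prints [1]
```

The index is the stored representation of the documents, and the tokenized query is the representation of the information need; retrieval is the comparison of the two.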
The knowledge system in which an IR system is embedded generally consists of three main components:
people in their role as information-processors
documents in their role as carriers of information
topics as representations

Based on the different types of services, IR can be categorized as:
How is access to the Internet changing our users’ …………….?
Expectations
Ways of engaging with information
Brains
Information needs/seeking activities and behaviors
Other thoughts?
