2008 © ChengXiang Zhai 1
Introduction to Research
ChengXiang Zhai
Department of Computer Science
University of Illinois, Urbana-Champaign
http://www-faculty.cs.uiuc.edu/~czhai, [email protected]
Outline
1. What is research?
2. How to prepare yourself for IR research?
3. How to identify and define a good IR research problem?
4. How to formulate and test IR research hypotheses?
5. How to write and publish an IR paper?
What is Research?
• Research
– Discover new knowledge
– Seek answers to questions
• Basic research
– Goal: Expand man’s knowledge (e.g., which genes control social behavior of honey bees?)
– Often driven by curiosity (but not always)
– High impact examples: relativity theory, DNA, …
• Applied research
– Goal: Improve human condition (i.e., improve the world) (e.g., how to cure cancers?)
– Driven by practical needs
– High impact examples: computers, transistors, vaccinations, …
• The boundary is vague; distinction isn’t important
Why Research?
[Figure: diagram with the labels Curiosity; Basic Research; Applied Research; Application Development; Amount of knowledge; Advancement of Technology; Utility of Applications; Quality of Life]
Where’s IR Research?
[Figure: diagram with the labels Basic Research; Applied Research; Application Development; Amount of knowledge; Advancement of Technology; Utility of Applications; Quality of Life; Information Science; Computer Science]
Where’s Your Position?
[Figure: diagram with the labels Basic Research; Applied Research; Application Development; Amount of knowledge; Advancement of Technology; Utility of Applications; Quality of Life]
Different positions benefit from different collaborators
Research Process
• Identification of the topic (e.g., Web search)
• Hypothesis formulation (e.g., algorithm X is better than Y=state-of-the-art)
• Experiment design (measures, data, etc) (e.g., retrieval accuracy on a sample of web data)
• Test hypothesis (e.g., compare X and Y on the data)
• Draw conclusions and repeat the cycle of hypothesis formulation and testing if necessary (e.g., Y is better only for some queries, now what?)
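
The "test hypothesis" and "draw conclusions" steps can be sketched in code. This is only an illustration under stated assumptions: average precision is assumed as the accuracy measure, and the function names (`average_precision`, `compare`) and the toy rankings and relevance judgments are invented for the example, not taken from the slides.

```python
from statistics import mean

def average_precision(ranking, relevant):
    """Average precision of one ranked list against a set of relevant doc ids."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / len(relevant) if relevant else 0.0

def compare(runs_x, runs_y, qrels):
    """Return mean average precision for X and Y, plus the queries each one wins."""
    ap_x = {q: average_precision(runs_x[q], qrels[q]) for q in qrels}
    ap_y = {q: average_precision(runs_y[q], qrels[q]) for q in qrels}
    x_wins = sorted(q for q in qrels if ap_x[q] > ap_y[q])
    y_wins = sorted(q for q in qrels if ap_y[q] > ap_x[q])
    return mean(ap_x.values()), mean(ap_y.values()), x_wins, y_wins

# Toy data, invented for illustration only.
qrels = {"q1": {"d1", "d3"}, "q2": {"d2"}}
runs_x = {"q1": ["d1", "d3", "d2"], "q2": ["d1", "d2", "d3"]}
runs_y = {"q1": ["d2", "d1", "d3"], "q2": ["d2", "d1", "d3"]}

map_x, map_y, x_wins, y_wins = compare(runs_x, runs_y, qrels)
print(f"MAP X={map_x:.3f}  MAP Y={map_y:.3f}  X wins {x_wins}  Y wins {y_wins}")
```

Keeping the per-query winners next to the averages is what exposes the "better only for some queries" situation, which is usually the starting point for the next hypothesis.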
Typical IR Research Process
• Look for a high-impact topic (basic or applied)
• New problem: define/frame the problem
• Identify weakness of existing solutions if any
• Propose new methods
• Choose data sets (often a main challenge)
• Design evaluation measures (can be very difficult)
• Run many experiments (need to have clear research hypotheses)
• Analyze results and repeat the steps above if necessary
• Publish research results
Research Methods
• Exploratory research: Identify and frame a new problem (e.g., “a survey/outlook of personalized search”)
• Constructive research: Construct a (new) solution to a problem (e.g., “a new method for expert finding”)
• Empirical research: evaluate and compare existing solutions (e.g., “a comparative evaluation of link analysis methods for web search”)
• The “E-C-E cycle”: exploratory → constructive → empirical → exploratory → …
Types of Research Questions and Results
• Exploratory (Framework): What’s out there?
• Descriptive (Principles): What does it look like? How does it work?
• Evaluative (Empirical results): How well does a method solve a problem?
• Explanatory (Causes): Why does something happen the way it happens?
• Predictive (Models): What would happen if xxx ?
Solid and High Impact Research
• Solid work:
– A clear hypothesis (research question) with conclusive result (either positive or negative)
– Clearly adds to our knowledge base (what can we learn from this work?)
– Implications: a solid, focused contribution is often better than a non-conclusive broad exploration
• High impact = high-importance-of-problem * high-quality-of-solution
– high impact = open up an important problem
– high impact = close a problem with the best solution
– high impact = major milestones in between
– Implications: question the importance of the problem and don’t just be satisfied with a good solution, make it the best
What It Takes to Do Research
• Curiosity: allows you to ask questions
• Critical thinking: allows you to challenge assumptions
• Learning: takes you to the frontier of knowledge
• Persistence: so that you don’t give up
• Respect for data and truth: ensures your research is solid
• Communication: allows you to publish your work
• …
Learning about IR
• Start with an IR textbook (e.g., Manning et al., Grossman & Frieder, a forthcoming book from UMass, …)
• Then read “Readings in IR” by Karen Sparck Jones and Peter Willett
• And read papers recommended in the following article: http://www.sigir.org/forum/2005D/2005d_sigirforum_moffat.pdf
• Read other papers published in recent IR/IR-related conferences
• Take advantage of online resources (e.g., http://timan.cs.uiuc.edu/resources)
Learning about IR (cont.)
• Getting more focused
– Choose your favorite sub-area (e.g., retrieval models)
– Extend your knowledge about related topics (e.g., machine learning, statistical modeling, optimization)
• Stay at the frontier:
– Keep monitoring literature in both IR and related areas
• Broaden your view: Keep an eye on
– Industry activities
• Read about industry trends
• Try out novel prototype systems
– Funding trends
• Read requests for proposals
Critical Thinking
• Develop a habit of asking questions, especially why questions
• Always try to make sense of what you have read/heard; don’t let any question pass by
• Get used to challenging everything
• Practical advice
– Question every claim made in a paper or a talk (can you argue the other way?)
– Try to write two opposite reviews of a paper (one mainly to argue for accepting the paper and the other for rejecting it)
– Force yourself to challenge one point in every talk that you attend and raise a question
Respect Data and Truth
• Be honest with the experiment results
– Don’t throw away negative results!
– Try to learn from negative results
• Don’t twist data to fit your hypothesis; instead, let the hypothesis choose data
• Be objective in data analysis and interpretation; don’t mislead readers
• Aim at understanding/explanation instead of just good results
• Be careful not to over-generalize (for both good and bad results); you may be far from the truth
Communications
• General communication skills:
– Oral and written
– Formal and informal
– Talk to people with different levels of background
• Be clear, concise, accurate, and adaptive (elaborate with examples, summarize by abstraction)
• English proficiency
• Get used to talking to people from different fields
Persistence
• Work only on topics that you are passionate about
• Work only on hypotheses that you believe in
• Don’t draw negative conclusions prematurely or give up easily
– Positive results may be hidden in negative results
– In many cases, negative results don’t completely reject a hypothesis
• Be comfortable with criticisms about your work (learn from negative reviews of a rejected paper)
• Think of possibilities of repositioning a work
Optimize Your Training
• Know your strengths and weaknesses
– strong in math vs. strong in system development
– creative vs. thorough
– …
• Train yourself to fix weaknesses
• Find strategic partners
• Position yourself to take advantage of your strengths
Part 3. How to identify and define a good IR research problem?
What is a Good Research Problem?
• Well-defined: Would we be able to tell whether we’ve solved the problem?
• Highly important: Who would care about the solution to the problem? What would happen if we don’t solve the problem?
• Solvable: Is there any clue about how to solve it? Do you have a baseline approach? Do you have the needed resources?
• Matching your strength: Are you in a good position to solve the problem?
Challenge-Impact Analysis
[Figure: chart with axes Level of Challenges and Impact/Usefulness, scale labels Known and Unknown, and the following region labels]
• Good applications: not interesting for research
• High impact, low risk (easy): good short-term research problems
• High impact, high risk (hard): good long-term research problems
• Difficult basic research problems, but questionable impact
• Low impact, low risk: bad research problems (may not be publishable)
• “Entry point” problems
Optimizing “Research Return”: Pick a Problem Best for You
[Figure: diagram with regions labeled Your Passion, High (Potential) Impact, and Your Strength, plus the label “Best problems for you”]
Find your passion: If you don’t have to work/study for money, what would you do?
Test of impact: If you are given $1M to fund a research project, what would you fund?
Find your strength/Avoid your weakness: What are you (not) good at?
How to Find a Problem?
• Application-driven (Find a nail, then make a hammer)
– Identify a need by people/users that cannot be satisfied well currently (“complaints” about current data/information management systems?)
– How difficult is it to solve the problem?
• No big technical challenges: do a startup
• Lots of big challenges: write a research proposal
– Identify one technical challenge as your topic
– Formulate/frame the problem appropriately so that you can solve it
• Aim at a completely new application/function (find a high-stake nail)
How to Find a Problem? (cont.)
• Tool-driven (Hold a hammer, and look for a nail)
– Choose your favorite state-of-the-art tools
• Ideally, you have a “secret weapon”
• Otherwise, bring tools from area X to area Y
– Look around for possible applications
– Find a novel application that seems to match your tools
– How difficult is it to use your tools to solve the problem?
• No big technical challenges: do a startup
• Lots of big challenges: write a research proposal
– Identify one technical challenge as your topic
– Formulate/frame the problem appropriately so that you can solve it
• Aim at important extension of the tool (find an unexpected application and use the best hammer)
How to Find a Problem? (cont.)
• In practice, you do both in various kinds of ways
– You talk to people in application domains and identify new “nails”
– You take courses and read books to acquire new “hammers”
– You check out related areas for both new “nails” and new “hammers”
– You read visionary papers and the “future work” sections of research papers, and then take a problem from there
– …
Three Basic Questions to Ask about an IR Problem
• Who are the users?
– Everyone vs. small group of people
• What data do we have?
– Web (whole web vs. sub-web)
– Email (public email vs. personal email)
– Literature (general vs. special discipline)
– Blog, forum, …
• What functions do we want to support?
– Information access vs. knowledge acquisition
– Decision and task support
• Example (callouts on the slide): users = everyone (who has an Internet connection); data = the whole web (indexed by Google); function = search (by keywords)
Look for New IR Research Questions
• Driven by new data: X is a new type of data emerging (e.g., X = blog vs. news)
– How is X different from existing types of data?
– What new issues/problems are raised by X?
– Are existing methods sufficient for solving old problems on X? If not, what are the new challenges?
– What new methods are needed?
– Are old evaluation measures adequate?
• Driven by new users: Y is a set of new users (e.g., ordinary people vs. librarians)
– How are the new users different from old ones? What new needs do they have?
– Can existing methods work well to satisfy their needs? If not, what are the new challenges?
– What new functions are appropriate for Y?
• Driven by new tasks (not necessarily new users or new data): Z is a new task (e.g., social networking, online shopping)
– What information management functions are needed to better support Z?
– Can these new functions be reduced to old ones? If not, what are the new challenges?
Map of IR Applications
[Figure: map relating data, users, and functions, with example applications placed on it]
• Data: Web pages, News articles, Email messages, Literature, Organization docs, Legal docs/Patents, Medical records, Customer complaint letters/transcripts, Blog articles, …
• Users: Kids, Peking Univ. community, Lawyers, Scientists, Customer Service People, Online Shoppers
• Functions: Search, Browsing, Alert, Mining, Task/Decision support
• Example applications: Email management + automatic reply, “Google Kids”, Legal Info Systems, Literature Assistant, Intranet Search, Local Web Service, ?
High-Level Challenges in IR
• How to make use of imperfect IR techniques to do something useful?
– Save human labor (e.g., partially automate a task)
– Create “add on” value (e.g., literature alert)
– A lot of HCI issues (e.g., allowing users to control)
• How to develop robust, effective, and efficient methods for a particular application?
– Methods need to “work all the time” without failure
– Methods need to be accurate enough to be useful
– Methods need to be efficient enough to be useful
Challenge 1: From Search to Information Access
• Search is only one way to access information
• Browsing and recommendation are two other ways
• How can we effectively combine these three ways to provide integrated information access?
• E.g., artificially linking search results with additional hyperlinks, “literature pop-ups”…
Challenge 2: From Information Access to Task Support
• The purpose of accessing information is often to perform some tasks
• How can we go beyond information access to support a user at the task level?
• E.g., automatic/semi-automatic email reply for customer service, literature information service for paper writing (suggest relevant citations, term definitions, etc), comparing prices for shoppers
Challenge 3: Support Whole Life Cycle of Information
• A life cycle of information consists of “creation”, “storage”, “transformation”, “consumption”, “recycling”, etc
• Most existing applications support one stage (e.g., search supports “consumption”)
• How can we support the whole life cycle in an integrated way?
• E.g., Community publication/subscription service (no need for crawling, user profiling)
Challenge 4: Collaborative Information Management
• Users (especially similar users) often have similar information needs
• Users who have explored the information space can share their experiences with other users
• How to exploit the collective expertise of users and allow users to help each other?
• E.g., allowing “information annotation” on the Web (“footprints”), collaborative filtering/retrieval, …
General Steps to Define a Research Problem
• Generate and Test
• Raise a question
• Novelty test: Figure out to what extent we know how to answer the question
– There’s already an answer to it: Is the answer good enough?
• Yes: not interesting, but can you make the question more challenging?
• No: your research problem is how to get a better answer to the raised question
– No obvious answer: you’ve got an interesting problem to work on
• Tractability test: Figure out whether the raised question can be answered
– I can see a way to answer it or potentially answer it: you’ve got a solvable problem
– I can’t easily see a way to answer it: Is it because the question is too hard or you’ve not worked hard enough? Try to reframe the problem to make it easier
• Evaluation test: Can you obtain a data set and define measures to test solutions/answers?
– Yes: you’ve got a clearly defined problem to work on
– No: can you think of any way to indirectly test the solutions/answers? Can you reframe the problem to fit the data?
• Every time you reframe a problem, try to do all three tests again.
Rigorously Define Your Research Problem
• Exploratory: what is the scope of exploration? What is the goal of exploration? Can you rigorously answer these questions?
• Descriptive: what does it look like? How does it work? Can you formally define a principle?
• Evaluative: can you clearly state the assumptions about data collection? Can you rigorously define measures?
• Explanatory: how can you rigorously verify a cause?
• Predictive: can you rigorously define what prediction is to be made?
Frame a New Computation Task
• Define basic concepts
• Specify the input
• Specify the output
• Specify any preferences or constraints
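
One way to make these four steps concrete is to write the task down as types before proposing any method. The sketch below is purely illustrative: the task (ad hoc ranking with a result-count limit), the names `Document`, `RankingTask`, `violations`, and `term_overlap_ranker`, and the toy collection are all assumptions introduced here, not part of the slides.

```python
from dataclasses import dataclass
from typing import List, Sequence

@dataclass(frozen=True)
class Document:          # basic concept: the unit being retrieved
    doc_id: str
    text: str

@dataclass(frozen=True)
class RankingTask:       # input: a query plus a collection; output: ranked doc ids
    name: str
    max_results: int     # constraint: at most this many ids may be returned

def violations(task: RankingTask, docs: Sequence[Document],
               ranked_ids: Sequence[str]) -> List[str]:
    """List the ways an output breaks the task's stated constraints."""
    known = {d.doc_id for d in docs}
    problems = []
    if len(ranked_ids) > task.max_results:
        problems.append("too many results")
    if len(set(ranked_ids)) != len(ranked_ids):
        problems.append("duplicate ids")
    if not set(ranked_ids) <= known:
        problems.append("unknown ids")
    return problems

def term_overlap_ranker(task: RankingTask, query: str,
                        docs: Sequence[Document]) -> List[str]:
    """A deliberately simple baseline: rank by number of shared query terms."""
    q_terms = set(query.lower().split())
    scored = [(len(q_terms & set(d.text.lower().split())), d.doc_id) for d in docs]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first, ties by id
    return [doc_id for score, doc_id in scored if score > 0][:task.max_results]

task = RankingTask(name="toy ad hoc search", max_results=2)
docs = [Document("d1", "web search engines"),
        Document("d2", "honey bees"),
        Document("d3", "search")]
ranked = term_overlap_ranker(task, "web search", docs)
print(ranked, violations(task, docs, ranked))
```

Writing the constraint check separately from any ranker means every candidate method, including the baseline, is held to the same definition of a valid output.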
From a new application to a clearly defined research problem
• Try to picture a new system, thus clarify what new functionality is to be provided and what benefit you’ll bring to a user
• Among all the system modules, which are easy to build and which are challenging?
• Pick a challenge and try to formalize the challenge
– What exactly would be the input?
– What exactly would be the output?
• Is this challenge really a new challenge (not immediately clear how to solve it)?
– Yes, your research problem is how to solve this new problem
– No, it can be reduced to some known challenge: are existing methods sufficient?
• Yes, not a good problem to work on
• No, your research problem is how to extend/adapt existing methods to solve your new challenge
• Tuning the problem
Tuning the Problem
[Figure: chart with axes Level of Challenges and Impact/Usefulness, scale labels Known and Unknown, annotated with three moves]
• Make a hard problem easier
• Make an easy problem harder
• Increase impact (more general)
“Short-Cut” for starting IR research
• Scan most recently published papers to find papers that you like or can understand
• Read such papers in detail
• Track down background papers to increase your understanding
• Brainstorm ideas of extending the work
– Start with ideas mentioned in the future work part
– Systematically question the solidness of the paper (have the authors answered all the questions? Can you think of questions that aren’t answered?)
– Is there a better formulation of the problem?
– Is there a better method for solving the problem?
– Is the evaluation solid?
• Pick one new idea and work on it
Part 4. How to formulate and test IR research hypotheses?
Formulate Research Hypotheses
• Typical hypotheses in IR:
– Hypothesis about user characteristics (tested with user studies or user-log analysis, e.g., clickthrough bias)
– Hypothesis about data characteristics (tested with fitting actual data, e.g., Zipf’s law)
– Hypothesis about methods (tested with experiments):
• Method A works (or doesn’t work) for task B under condition C by measure D (feasibility)
• Method A performs better than method A’ for task B under condition C by measure D (comparative)
• Introducing baselines naturally leads to hypotheses
• Carefully study existing literature to figure out where exactly you can make a new contribution (what do you want others to cite your work as?)
• The more specialized a hypothesis is, the more likely it’s new, but a narrow hypothesis has lower impact than a general one, so try to generalize as much as you can to increase impact
• But avoid over-generalizing (must be supported by your experiments)
• Tuning hypotheses
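
As an illustration of testing a hypothesis about data characteristics by fitting actual data, the sketch below estimates the slope of log(frequency) against log(rank); under Zipf’s law that slope is close to -1. Ordinary least squares on the log-log values is assumed here as a simple, rough estimator, and the function name `zipf_slope` and the synthetic token list are invented for the example.

```python
import math
from collections import Counter

def zipf_slope(tokens):
    """Least-squares slope of log(frequency) vs. log(rank) for a token list.

    Needs at least two distinct tokens; a slope near -1 is consistent with Zipf's law.
    """
    counts = sorted(Counter(tokens).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Synthetic data constructed to follow frequency = 60 / rank exactly.
tokens = []
for rank in range(1, 6):
    tokens += [f"w{rank}"] * (60 // rank)

print(f"estimated slope: {zipf_slope(tokens):.3f}")
```

On a real collection the fit will not be exact, so the interesting part is how far and where the data departs from the line, which feeds the next, more specific hypothesis.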
Procedure of Hypothesis Testing
• Clearly define the hypothesis to be tested (include any necessary conditions)
• Design the right experiments to test it (experiments must match the hypothesis in all aspects)
• Carefully analyze results (seek understanding and explanation rather than just description)
• Unless you’ve got a complete understanding of everything, always attempt to formulate a further hypothesis to achieve better understanding
Clearly Define a Hypothesis
• A clearly defined hypothesis helps you choose the right data and right measures
• Make sure to include any necessary conditions so that you don’t overclaim
• Be clear about any justification for your hypothesis (testing a random hypothesis requires more data than testing a well-justified hypothesis)
Design the Right Experiments
• Flawed experiment design is a common cause of rejection of an IR paper (e.g., a poorly chosen baseline)
• The data should match the hypothesis
– A general claim like “method A is better than B” would need a variety of representative data sets to prove
• The measure should match the hypothesis
– Multiple measures are often needed (e.g., both precision and recall)
• The experiment procedure shouldn’t be biased
– Comparing A with B requires using identical procedure for both
– Common mistake: baseline method not tuned or not tuned seriously
• Test multiple hypotheses simultaneously if possible (for the sake of efficiency)
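As an illustration of matching measures and procedure to the hypothesis, here is a minimal Python sketch that scores any run with the same two measures (precision and recall) at the same cutoff. The function names, the `run`/`qrels` dictionary layout, and the cutoff value are assumptions made for this example; they are not taken from the slides or from any particular evaluation toolkit.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: list of document ids returned by a method (hypothetical input).
    relevant:  set of document ids judged relevant (hypothetical input).
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def evaluate(run, qrels, cutoff=10):
    """Mean precision and recall at a fixed cutoff over all judged queries.

    run:   dict mapping query id -> ranked list of document ids.
    qrels: dict mapping query id -> set of relevant document ids.
    A query missing from the run counts as an empty result list,
    so a method cannot gain by skipping hard queries.
    """
    if not qrels:
        raise ValueError("qrels must contain at least one query")
    precisions, recalls = [], []
    for query_id in sorted(qrels):
        top_k = run.get(query_id, [])[:cutoff]
        p, r = precision_recall(top_k, qrels[query_id])
        precisions.append(p)
        recalls.append(r)
    n = len(qrels)
    return sum(precisions) / n, sum(recalls) / n
```

Calling `evaluate` on the baseline run and on the proposed run with the same `qrels` and the same `cutoff` keeps the comparison procedure identical for both methods.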
Carefully Analyze the Results
• Do the significance test if possible/meaningful
• Go beyond just getting a yes/no answer
– If positive: seek evidence to support your original justification of the hypothesis
– If negative: look into the reasons to understand how your hypothesis should be modified
– In general, seek explanations of everything!
• Get as much as possible out of the results of one experiment before jumping to run another
– Don’t throw away negative data
– Try to think of alternative ways of looking at data
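One common way to run a significance test on paired per-query scores is a sign-flip randomization test. The sketch below is only an illustration under stated assumptions: the function name, the number of trials, and the fixed seed are choices made for this example, and the slides do not prescribe any particular test.

```python
import random


def paired_randomization_test(scores_a, scores_b, trials=10000, seed=0):
    """Two-sided sign-flip randomization test on paired per-query scores.

    scores_a, scores_b: per-query scores of methods A and B, in the same
    query order (hypothetical inputs). Returns an estimated p-value for
    the null hypothesis that A and B are interchangeable on each query.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both methods must be scored on the same queries")
    if not scores_a:
        raise ValueError("at least one query is required")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    observed = abs(sum(diffs) / n)
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    extreme = 0
    for _ in range(trials):
        # Under the null, the sign of each per-query difference is arbitrary.
        total = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(total / n) >= observed - 1e-12:
            extreme += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return (extreme + 1) / (trials + 1)
```

A small p-value only answers the yes/no question; the per-query differences in `diffs` are still worth inspecting to explain where the gain or loss comes from.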
Modify a Hypothesis
• Don’t stop at the current hypothesis; try to generate a modified hypothesis to further discover new knowledge
• If your hypothesis is supported, think about the possibility of further generalizing the hypothesis and test the new hypothesis
• If your hypothesis isn’t supported, think about how to narrow it down to some special cases to see if it can be supported in a weaker form
Derive New Hypotheses
• After you finish testing some hypotheses and reaching conclusions, try to see if you can derive interesting new hypotheses
– Your data may suggest an additional (sometimes unrelated) hypothesis; you get a by-product
– A new hypothesis can also logically follow a current hypothesis or help further support a current hypothesis
• New hypotheses may help find causes:
– If the cause is X, then H1 must be true, so we test H1
When to Write a Paper?
• Survey/Review paper:
– An emerging field or topic has appeared (i.e., a hot topic) but no survey is available, or sufficient new development has occurred such that existing surveys are out of date
– You’ve read and digested enough papers about the topic
• Original research paper: when you have sufficient results to draw an interesting conclusion or answer an interesting research question, i.e., you’ve got a basic story to tell, e.g.,
– A new problem, a solution, and results showing how good the solution is
– An old problem, a new solution, and results showing advantage(s) of the new solution over the old ones
– An old problem, many old solutions, and results showing an understanding of their relative performance
– In general, a research question and an answer
Before you write any paper, be clear about the targeted readers
Typical Structure of a Survey Paper
• Introduction:
– Motivation for the survey
• An emerging field/topic, but no survey available
• Surveys exist, but they are out of date (e.g., due to new development in a field/topic)
– Scope of the survey
• Background (if necessary)
• Conceptual framework (based on synthesis of the literature)
– Define basic concepts, terminology, etc
– Give a big picture of the topic so that your survey is coherent
Typical Structure of a Survey Paper (cont.)
• Systematic review of existing work
– It’s very important that you have some clear structure for this part
• The structure is usually your conceptual framework, or
• other meaningful structures (e.g., by time or some way to classify all the work)
– Be critical! Add your opinions about the work surveyed
– Don’t treat every work equally; elaborate on some representative work and simply give pointers to other work
• Summary
– Summarize the progress and the state of the art
– Give recommendations if any (e.g., for practitioners)
– Outlook (remaining challenges, future directions)
• References
Typical Structure of a Research Paper
• 1. Introduction
– Background discussion to motivate your problem
– Define your problem
– Argue why it’s important to solve the problem
– Identify knowledge gap in existing work or point out deficiency of existing answers/solutions
– Summarize your contributions
– Briefly mention potential impact
• Tips:
– Start with sentences understandable to almost everyone
– Tell the story at a high-level so that the entire introduction is understandable to people with no/little technical background in the topic
– Use examples if possible
Typical Structure of a Research Paper (cont.)
• 2. Previous/Related work
– Sometimes this part is included in the introduction or appears later
– Previous work = work that you extend (readers must be familiar with it to understand your contribution)
– Related work = work related to your work (readers can wait until later in the paper to learn about it)
• Tips:
– Make sure not to miss important related work
– Always safer to include more related work
– Discuss the existing work and its connection to your work
• Your work extends …
• Your work is similar to … but differs in that …
• Your work represents an alternative way of …
– Whenever possible, explicitly discuss your contribution in the context of existing work
Typical Structure of a Research Paper (cont.)
• 3. Problem definition/formulation
– Clearly define your problem
• If it’s a new problem, discuss its relation to existing related problems
• If it’s an old problem, cite the previous work
– Justify why you define the problem in this way
– Discuss challenges in solving the problem
• Tips:
– Give both an informal description and a formal description if possible
– Make sure that you mention any assumption you make when defining the problem (e.g., your focus may be on studying the problem in certain conditions)
Typical Structure of a Research Paper (cont.)
• 4. Overview of the solution(s) (can be merged with the next part)
– Give a high-level informal description of the proposed solutions or solutions you study
– Use examples if possible
• 5. Specific components of your solution(s)
– Be precise (formal description helps)
– Use intuitive descriptions to help people understand it
• Tips:
– Make sure that you organize this part so that it’s understandable to people with various backgrounds
– Don’t just throw in formulas; include high-level intuitive descriptions whenever possible
Typical Structure of a Research Paper (cont.)
• 6. Experiment design: make sure you justify it
– Data set
– Measures
– Experiment procedure
• Tips:
– Give enough details so that people can reproduce your experiments
– Discuss limitation/bias if any, and discuss its potential influence on your study
Typical Structure of a Research Paper (cont.)
• 7. Result analysis:
– Organized based on research questions to be answered or hypotheses tested
– Be comprehensive, but focus on the major conclusions
– Include “standard” components
• Baseline comparison
• Individual component analysis
• Parameter sensitivity analysis
• Individual query analysis
• Significance test
– Discuss the influence of any bias or limitation
• Tips
– Don’t leave any question unanswered (try to provide an explanation for all the observed results)
– Discuss your findings in the context of existing work if possible
• Similar observations have also been made in …
• This is in contrast to … observed in … One explanation is ….
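The individual query analysis listed among the standard components can start from something as simple as ranking queries by how much the proposed method changes the score relative to the baseline. The following Python sketch assumes per-query scores stored in plain dictionaries; the function names and the tie tolerance are hypothetical choices made for illustration only.

```python
def per_query_deltas(baseline, proposed):
    """Return (query_id, delta) pairs sorted from largest loss to largest gain.

    baseline, proposed: dicts mapping query id -> score of one measure
    (hypothetical inputs). Only queries scored by both methods are compared.
    """
    shared = sorted(set(baseline) & set(proposed))
    deltas = [(q, proposed[q] - baseline[q]) for q in shared]
    return sorted(deltas, key=lambda item: item[1])


def summarize(deltas, tol=1e-9):
    """Count queries where the proposed method wins, loses, or ties."""
    wins = sum(1 for _, d in deltas if d > tol)
    losses = sum(1 for _, d in deltas if d < -tol)
    ties = len(deltas) - wins - losses
    return {"wins": wins, "losses": losses, "ties": ties}
```

The queries at both ends of the sorted list are natural starting points when looking for an explanation of the observed results.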
Typical Structure of a Research Paper (cont.)
• 8. Conclusions and future work
– Summarize your contributions
– Discuss its potential impact
– Discuss its limitation and point out directions for future work
• 9. References
Tips on Polishing your Paper
• Start with the core messages you want to convey in the paper and expand your paper by following the core story
• Try to convey the core messages at different levels so that people with different knowledge background can all get them
• Try to write a review of your paper yourself, commenting on its originality, technical soundness, significance, evaluation, etc, and then revise the paper if needed
• Check out reviewers’ instructions, e.g., the following: http://nips07.stanford.edu/nips07reviewers.html (not necessarily matching your conference, but should share a lot of common requirements)
• Try to polish your English as much as you can
What an IR reviewer often looks for
• Most important factors:
– Realistic setup of a retrieval problem
• What kind of users would benefit from your research?
– Solid evaluation of methods
• Truly state of the art baseline
• Careful selection of data sets
– Use as many representative data sets as possible
– Always use a standard data set (e.g., TREC) if possible
• Careful definition of measures
• Unbiased experiment procedure
• General factors:
– Quality of argument, novelty, writing, …
– Avoid all kinds of careless mistakes! (If you aren’t careful about writing, it’s possible you aren’t careful about your experiments either.)
Where to Publish IR Papers
• Core IR conferences:
– ACM SIGIR, ACM CIKM
– ECIR, AIRS
• Core IR journals
– ACM TOIS, IRJ
– IPM, JASIS
• Web Applications
– WWW, WSDM
• Other related conferences
– Natural Language Processing: HLT, ACL, NAACL, COLING, EMNLP
– Machine Learning: ICML, NIPS
– Data Mining: KDD, ICDM
– Databases: SIGMOD, VLDB, ICDE
• …
After You Get Reviews Back
• Carefully classify comments into:
– Unreasonable comments (e.g., misunderstanding):
• Try to improve the clarity of your writing
– Reasonable comments
• Constructive: easy to implement
• Non-constructive: think about it, either argue the other way or mention weakness of your work in the paper
• If paper is accepted
– Take the last chance to polish the paper as much as you can
– You’ll regret it if you later discover an inaccurate statement or a typo in your published paper
• If paper is rejected
– Digest comments and try to improve the research work and the paper
– Run more experiments if necessary
– Don’t try to please reviewers (the next reviewer might say something opposite); instead use your own judgment and use their comments to help improve your judgment
– Reposition the paper if necessary (again, don’t reposition it just because a reviewer rejected your original positioning)
Summary
• Research is about discovery and increasing our knowledge (innovation & understanding)
• Intellectual curiosity and critical thinking are extremely important
• Work on important problems that you are passionate about
• Aim at becoming a top expert on one topic area
– Obtain complete knowledge about the literature on the topic (read all the important papers and monitor the progress)
– Write a survey if appropriate
– Publish one or more high-quality papers on the topic
• Don’t give up!
• Good luck!