Frank Shipman has worked for years on an Interest Profile Manager (IPM) that runs locally on a user’s computer. They model users based on 3 sources: (1) interaction with their knowledge browser, (2) browser history, and (3). They are starting a new project to explore pooling individual models into community models while anonymizing the data. They model users using weighted term vectors clustered by interest. For example, to model Joel, they might identify 5 interests, such as STEM, software development, and bicycling, each of which is represented as a term vector. STEM, for instance, might be represented as:
nlvm^.5 nsdl^.3 math^.7 learners^.2 education^.8 “grand challenges”^.7
The advantage of identifying and modeling interests is that instead of modeling the average of a person’s interests, you model specific interests. Then when you match people or documents, you match them against specific interests instead of the average of their interests. You can see the benefit when you ask what the average of math and bicycling would even be. Alex and I have discussed this approach, but have not yet implemented it. I’m interested in seeing any research that compares the two approaches.
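To make the difference concrete, here is a minimal sketch of matching a document against separate interest vectors versus a single averaged vector. It is not the IPM implementation: the use of cosine similarity, the helper names (`cosine`, `average`, `best_interest`), the bicycling terms, and the sample document are all assumptions for illustration. Only the STEM weights come from the example above.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse term vectors (dicts of term -> weight)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def average(vectors):
    """Collapse several term vectors into one by averaging each term's weight."""
    terms = set().union(*vectors)
    n = len(vectors)
    return {t: sum(v.get(t, 0.0) for v in vectors) / n for t in terms}

def best_interest(interests, doc):
    """Return (name, score) of the single interest that matches the document best."""
    name = max(interests, key=lambda k: cosine(interests[k], doc))
    return name, cosine(interests[name], doc)

interests = {
    # Weights taken from the STEM example above.
    "STEM": {"nlvm": .5, "nsdl": .3, "math": .7, "learners": .2,
             "education": .8, "grand challenges": .7},
    # Hypothetical terms and weights, invented for this sketch.
    "bicycling": {"bicycle": .9, "commute": .6, "trail": .5},
}

# Hypothetical document vector about math education.
doc = {"math": 1.0, "education": 0.5}

per_interest = best_interest(interests, doc)
averaged = cosine(average(list(interests.values())), doc)
print("best interest:", per_interest)
print("averaged profile score:", averaged)
```

With these made-up numbers the math document scores higher against the STEM vector alone than against the averaged profile, because the unrelated bicycling terms dilute the average.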
So some of the questions related to modeling a learner using term vectors are:
- How many interests should you model?
- How do you discover the interests?
- How many terms should you include in each term vector?
- How do you do modeling and recommendations efficiently?
In Ed Fox’s presentation we are listening to Peter Brusilovsky talk via Second Life about the kinds of data that they use to model users, including search, tagging, comments, resources they have created including paths (Walden’s Paths), collaborative recommendations, and social navigation (which guides users to the most active resources). They use these models to recommend content and other users. CUMULATE is their group modeling framework. They are tracking browsing and problem solving. Steve demonstrated a bunch of tools they have in Second Life for Ensemble.
We are pursuing similar approaches in Folksemantic. “Systems support personalization by adapting functionality to individuals, by allowing users to customize the system, and by supporting human interaction inside of the system.” We allow people to register their bookmark, blog, and other feeds. We will also be using click, search, comment, and share data to model users and provide recommendations.
Posted on November 18th, 2009 by joel
Filed under: conferences