Closed Bug 1386629 Opened 8 years ago Closed 7 years ago

Perform analysis of recommendation strategies for legacy addon replacement

Categories

(Data Platform and Tools :: General, enhancement, P2)


Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: mlopatka, Assigned: mlopatka)

References

Details

No description provided.
Assignee: nobody → mlopatka
Using features from the AMO database exported to a local JSON blob. Similarity is computed between vectors containing: ['guid', 'legacy', 'ratings', 'installs', 'languages', 'summary', 'tags', 'title', 'categories']. New addons are recommended based on per-variable similarity scoring:
- modified hamming distance for: 'languages', 'tags', 'categories'
- cosine similarity in TF/IDF space for: 'summary', 'title'
Similarities are combined by weighted mean.

The current dump (August 3, 2017) shows counts of 16247 legacy addons and 3520 web extensions.

A prototype is available here: https://gist.github.com/mlopatka/ac2f98b33229ec126f2c8930ffb9f126#file-gen_legacy_addon_substitution_suggestions-py
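For reference, a minimal sketch (not the actual prototype from the gist above) of combining per-feature similarities by weighted mean. The set-based measure below, a Hamming distance over membership-indicator vectors normalized by the union size, is one plausible reading of the "modified hamming distance" mentioned above; the exact formulation in the prototype may differ, and fitting TF/IDF per pair (rather than once over the whole corpus) is only for brevity.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

SET_FEATURES = ["languages", "tags", "categories"]
TEXT_FEATURES = ["summary", "title"]

def set_similarity(a, b):
    # 1 - Hamming distance over membership indicators, normalized by the
    # union size (equivalently, Jaccard similarity of the two sets).
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0
    return 1.0 - len(a ^ b) / len(union)

def text_similarity(doc_a, doc_b):
    # Cosine similarity between two documents in a shared TF/IDF space.
    # (A real pipeline would fit the vectorizer once over the full corpus.)
    tfidf = TfidfVectorizer(stop_words="english").fit_transform([doc_a, doc_b])
    return cosine_similarity(tfidf[0], tfidf[1])[0, 0]

def combined_similarity(legacy, candidate, weights):
    # Weighted mean of per-feature similarities between two addon records.
    sims = {f: set_similarity(legacy[f], candidate[f]) for f in SET_FEATURES}
    sims.update({f: text_similarity(legacy[f], candidate[f]) for f in TEXT_FEATURES})
    total = sum(weights[f] for f in sims)
    return sum(sims[f] * weights[f] for f in sims) / total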
Current recommendations are unsatisfactory. The weight vector for combining similarity scores must be optimized against human feedback. I plan to implement a (less greedy) hill-climber via the simulated annealing algorithm: https://en.wikipedia.org/wiki/Simulated_annealing

With aggressive reinforcement I think I can perform semi-supervised training of the weight vector.
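A minimal sketch of the planned simulated-annealing loop. Here score_recommendations is a hypothetical stand-in for the human-feedback objective (e.g. a mean quality rating of the top suggestions produced under a given weight vector); the actual objective, perturbation scheme, and cooling schedule are still to be decided.

import math
import random

def anneal_weights(features, score_recommendations,
                   steps=200, t0=1.0, cooling=0.97, step_size=0.1):
    weights = {f: 1.0 for f in features}
    best = dict(weights)
    current_score = best_score = score_recommendations(weights)
    temperature = t0
    for _ in range(steps):
        # Perturb one randomly chosen weight (keeping it non-negative).
        candidate = dict(weights)
        f = random.choice(features)
        candidate[f] = max(0.0, candidate[f] + random.uniform(-step_size, step_size))
        candidate_score = score_recommendations(candidate)
        # Always accept improvements; accept regressions with a probability
        # that shrinks as the temperature cools (the "less greedy" part).
        delta = candidate_score - current_score
        if delta > 0 or random.random() < math.exp(delta / temperature):
            weights, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = dict(weights), current_score
        temperature *= cooling
    return best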
Language-based features are now refined using better stripping and cleaning of the reference vocabulary for TF/IDF. Recommendations are still not satisfactory. I've introduced a dummy variable to weight the features when computing scalar similarity for ranking; this will be varied by the simulated annealing algorithm in the future.

repo: https://github.com/mozilla/addon_recommender_driving_analyses/tree/master/legacy_swap
@Dexter is it possible (very difficult?) to get the 'category' field from the AMO database included in this feature space? The TF/IDF model suffers when comparing summaries/descriptions of very different lengths. So I can either do more aggressive preprocessing on the text (probably pretty expensive) or introduce an additional text feature with a more bounded vocabulary (i.e. categories).
Flags: needinfo?(aplacitelli)
(In reply to mlopatka from comment #4)
> @Dexter is it possible (very difficult?) to get the 'category' field from
> the AMO database included in this feature space?
> The TF/IDF model suffers when comparing summaries/descriptions of very
> different lengths. So I can either do more aggressive preprocessing on the
> text (probably pretty expensive) or introduce an additional text feature
> with a more bounded vocabulary (i.e. categories).

Yes, it is possible to get the category field from AMO. I sent you an email with the full dump.
Flags: needinfo?(aplacitelli)
(In reply to mlopatka from comment #4)
> @Dexter is it possible (very difficult?) to get the 'category' field from
> the AMO database included in this feature space?
> The TF/IDF model suffers when comparing summaries/descriptions of very
> different lengths. So I can either do more aggressive preprocessing on the
> text (probably pretty expensive) or introduce an additional text feature
> with a more bounded vocabulary (i.e. categories).

After discussing this with Martin, it turns out he really meant 'permission', not 'category' (which was already provided). The problem with the addon permissions is that they are only available for webextension addons, which makes this hardly useful for recommendations. We synced up over IRC for this :)
Simulated annealing approach implemented. I'll begin collecting some labels for the parameter weights. https://github.com/mozilla/addon_recommender_driving_analyses/tree/master/legacy_swap
Simulated annealing run to get new parameter weights using a *very* limited number of cycles. New feature weights are hard-coded on line 27 of addon_recommender_driving_analyses.py. In addition to some tweaks to the language-processing similarity over free-text features, recommendations seem to have improved a bit.

code audit request: Dexter
https://github.com/mozilla/addon_recommender_driving_analyses/tree/master/legacy_swap

Perhaps (time permitting) it would be worthwhile to get some people to run a few cycles of the simulated annealing and aggregate the weight vector data, compared against manual recommendation scoring? (One possible aggregation is sketched below.)
Flags: needinfo?(alessio.placitelli)
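If several people each run a few annealing cycles, one simple (hypothetical) aggregation scheme is to average the per-feature weights across runs and renormalize so they sum to 1; whether the repo adopts this or something more robust is an open question.

def aggregate_weight_vectors(runs):
    # runs: list of {feature: weight} dicts from independent annealing runs.
    features = runs[0].keys()
    mean = {f: sum(r[f] for r in runs) / len(runs) for f in features}
    total = sum(mean.values()) or 1.0
    return {f: w / total for f, w in mean.items()}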
We had a conversation with Martin while going through the design of the simulated annealing model: we identified a few areas of improvement (he'll comment/get to that later) and documented some "use cases" for the legacy recommender.

Use cases:
- a legacy addon was disabled, and the same addon is already available using the webextension technology;
- a legacy addon was disabled, the same addon is not available using webextension but a similar addon (by category, tags, description) is available using webextensions (exactly same functionalities);
- a legacy addon was disabled, but no match is available; other webextension addons with comparable (but not the same) features are available and can be recommended;
- a very rare legacy addon was disabled; there's no comparable, similar or related addon implemented using webextensions.

We should keep these use cases in mind when reasoning about the recommender and evaluating it (a possible labeling for these tiers is sketched below).
Flags: needinfo?(alessio.placitelli)
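A sketch of how the four use cases above could be turned into labels when hand-scoring recommendations during evaluation; the tier names are illustrative, not from the repo.

from enum import Enum

class MatchTier(Enum):
    SAME_ADDON = 1        # same addon already available as a webextension
    EQUIVALENT_ADDON = 2  # different addon, exactly the same functionality
    COMPARABLE_ADDON = 3  # related addon with comparable (not identical) features
    NO_MATCH = 4          # rare legacy addon with no reasonable substitute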
Martin, is this actively being worked on? Do you think you will work on this in Q4?
Flags: needinfo?(mlopatka)
Yes, we currently use a curated list provided by the AMO folks, but that is also being used to train a more automated method. Legacy-based recommendations are only going to be useful in Q4 and drop off in Q1/Q2 2018. So this is still on the Q4 agenda.
Flags: needinfo?(mlopatka)
As requested, here are some thoughts on performance which could be interesting to investigate if we're looking to move this forward to a wider audience in due course.

Currently the page (in disco pane) and the recommendations are keyed by the clientId. This means that from a caching perspective we can only cache per-user; for the disco pane, if the cache of the page is only fresh for 1 hour, every repeat visit after that generally has a stale cache, which means the entire chain of API calls is made all the way to TAAR and back.

It would be interesting to know, across the study, whether there are clusters of users that get the same list of recommendations. If we found that the clusters were large enough in size, we could look at alternative ways to key our requests so that we can cache the content for more users; this may apply to both AMO's requests and the TAAR engine itself. Being able to do more caching could potentially make a big difference to the overall performance of the service and help with handling the load.

Other factors to note: for the disco pane the other variables in the URL are locale (browser UI locale, not accept-language), firefox version, OS platform (e.g. Darwin) and compatibility mode (I'm not quite sure what the last one represents; it's often set to "normal"). E.g.:

https://discovery.addons.mozilla.org/%LOCALE%/firefox/discovery/pane/%VERSION%/%OS%/%COMPATIBILITY_MODE%

From AMO's perspective we would need to know what size a cluster of recommendations would be, taking those parameters into account, to be able to improve on the whole-page caching we're currently doing. Of course AMO is only one part of this; increased caching at any level will likely improve the overall performance.
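As a rough illustration of the cluster-size question, the sketch below counts how many clients share an identical recommendation list; recommendations_by_client is a hypothetical mapping from clientId to its ordered list of recommended GUIDs.

from collections import Counter

def recommendation_cluster_sizes(recommendations_by_client):
    # Count how many clients receive each exact (ordered) recommendation list.
    clusters = Counter(tuple(recs) for recs in recommendations_by_client.values())
    return clusters.most_common()

# If the top clusters cover most clients, keying the cache by something like
# (locale, version, OS, list-hash) instead of clientId could serve far more
# requests from cache, at both the AMO and TAAR layers.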
Priority: -- → P2
Changed to new component, per bug 1425844.
Component: General → Add-on Recommender

no more legacy addons.

Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX
Component: Add-on Recommender → General