Instead of maintaining and distributing your open-source project in the dark, take steps to ensure its success. Scarf helps you understand how your code is used and connects you with the right people at the companies that rely on your work.
Would you like to understand your project's impact, identify potential customers, or showcase growth without sacrificing end-user privacy? Scarf helps you understand how users are interacting with your open-source project at every step of their journey with your software.
You had me at "in the dark"!
Anyone else heard of Scarf, or even tried it yet?
Scarf's Documentation Insights provides simple tracking pixels for a better understanding of your project's web traffic. By instrumenting your READMEs, documentation websites, tutorial pages and other web properties of your project, Scarf can tell you which companies are frequently viewing your documentation—providing access to those more likely to pay for training or support.
I want this. But I don't think this would be able to track users reading READMEs on GitHub, due to https://github.blog/2014-01-28-proxying-user-images/.
Also, if this is done "without sacrificing end-user privacy" (as promised), how do they get informed consent from users now that they're being tracked (which is legally required in many jurisdictions)? E.g. If a maintainer adds this to their docs site, will they have to add an annoying cookie banner to their site? Etc.
SDKs for package authors
Scarf's SDKs provide an easy way to understand how your software is being used. By simply adding a dependency on a Scarf library (eg scarf-js), you can start collecting actionable installation analytics that can help you keep your package working smoothly.
Want to know which versions of your package are being used? Did your most recent release break things for your users? Are there companies using your library that would pay you for a support contract? You won't have to write a single line of code to find out!
I want this too, but again, there seem to be some unanswered legal and UI questions about getting users' consent, as well as technical concerns about how this could work for open source libraries that are running in an environment without internet access.
Still, there is certainly a compelling value proposition here and the various questions may have good answers. I can post back here if I find out anything and people are interested, but please feel free to beat me to it and/or share your thoughts in the meantime!
Top comments (7)
👋 Founder of Scarf here, happy to answer some of these questions.
You are correct that no cookie banners are needed with Documentation Insights, as it does not rely on cookies at all. It works entirely on information from HTTP headers and IP address metadata (raw IPs are purged from the system).
That principle of always deleting personally identifiable information is a major factor in keeping Scarf compliant under GDPR and similar kinds of regulations, as well as why maintainers are not legally required to collect explicit opt-ins while using Scarf. Under GDPR, Scarf is the data processor, acting on behalf of the data controller, the OSS project, which never actually touches any sensitive data.
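To make the cookie-free approach described above concrete, here is a minimal Python sketch of how a tracking-pixel endpoint could reduce a request to organization-level metadata and then drop the raw IP. This is not Scarf's actual implementation: the function name `summarize_pixel_request`, the `ORG_BY_NETWORK` table, and the example networks (reserved documentation ranges) are all hypothetical.

```python
import ipaddress

# Hypothetical lookup table mapping networks to organizations. A real
# service would consult an IP-metadata database instead (assumption).
ORG_BY_NETWORK = {
    ipaddress.ip_network("203.0.113.0/24"): "Example Corp",
    ipaddress.ip_network("198.51.100.0/24"): "Sample Labs",
}


def summarize_pixel_request(remote_ip, headers):
    """Reduce one pixel request to a record that omits the raw IP."""
    addr = ipaddress.ip_address(remote_ip)
    # Resolve the address to an organization, or None if it is unknown.
    org = next(
        (name for net, name in ORG_BY_NETWORK.items() if addr in net),
        None,
    )
    # Only derived metadata and selected headers are kept; remote_ip is
    # deliberately not included in the returned record.
    return {
        "organization": org,
        "referer": headers.get("Referer"),
        "user_agent": headers.get("User-Agent"),
    }
```

Calling `summarize_pixel_request("203.0.113.7", {"Referer": "https://example.org/docs"})` would yield a record naming the organization and the referring page, with no IP address and no cookie involved.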
What we tried with scarf-js was an early attempt to get package authors this valuable data, to help them build better and more sustainable projects. While this particular usage of postInstall hooks had no malicious intent or effects whatsoever, it was unfortunately (and perhaps not surprisingly) unpopular with end-users in the JS community. We took the feedback to heart and continue to push forward on our ideas however we can, and so Scarf has largely de-prioritized our package SDKs in favor of Scarf Gateway (about.scarf.sh/scarf-gateway). By augmenting the registry layer, Scarf Gateway is able to surface all of the same data without adding any kind of telemetry hooks or even running code on the users' machines at all. scarf-js is still in use by a handful of npm packages so we will continue to support it, but phone-home mechanisms have categorically proven too unpopular with the OSS community, and ultimately Scarf exists to help the community.
Empowering OSS developers with safe and useful analytics has not been traditionally popular, but the reality is that many many maintainers badly need it. We'll continue to listen to what developers tell us and build the best solutions we can to fit everyone's preferences and needs!
Thanks Avi, great to hear more about Scarf, straight from the source!
I decided to start out by trying Documentation Insights (on my bidict.readthedocs.io docs site), and I'm happy to say it's been enlightening already. (E.g. I had no idea that folks at Tesla have been reading my docs, and might be using my project!) Any plans to add a weekly email digest feature with the latest insights, in case I forget to check the Scarf dashboard for a while?
Thanks again for the helpful info you posted, and for offering such a useful service to the open source community. Looking forward to following Scarf updates in the future.
Love that! I'm so glad to hear you're finding it useful.
Currently we do send out a weekly digest of new companies that have surfaced but that's only hooked up to package download metrics; Documentation Insights is not yet included. We'll get that added soon! Really appreciate the feedback, we'll definitely be improving those digest emails.
I found docs.scarf.sh/web-traffic/, which says "Scarf does not store the IP address itself, so no personally identifiable information is collected." I guess this means no cookie banner required?
So that answers my UI question. But then I scrolled down and saw...
...and yet there's no mention that this relies on the security vulnerability reported at blog.npmjs.org/post/141702881055/p... and more recently publicized in youtu.be/24tQRwIRP_w?t=923 (since which many users have disabled npm install scripts, corporate users especially).
If a company using my project wants to interact, they can do that through the bug tracker, right? What interest would a maintainer have in taking the time to gather feedback from users unwilling to take the time to open an issue (instead of fixing something from the bug tracker backlog directly)?
You can do much more than just gather feedback from companies relying on your project! You can convince them to sponsor you, you can broker a support contract with them, consult, etc. If maintainers can be more proactive, they will be more successful, regardless of what their goals are.
OK, that makes sense. It's probably something I won't or can't do, because I'm a really bad salesman.