[] Pretty flower.

* Introduction

  Misprint: Google, not CollabNet.

  This talk was originally called "Technologies of Cooperation". Then
  they suggested I change it to "Ten Tools Developers Need". Sure, I
  said. But it's still the same talk.

  Having the right tools can make more difference in a project than
  just about anything else.

[] GOOD TOOLS

  - "We need more content, let's try to get more people writing HTML
    pages" vs just installing a Wiki.

  - anonymous read-only CVS. Make sure to ask people to remember back
    to the days before it. Let this lead into "accidental invention"
    factor, which then leads into how we cannot predict how many of
    the ideas in this talk are really worth anything.

  * The art class metaphor.

  * This talk will discuss some existing tools, and propose some new
    ones, *but* it's not really about tools at all. It is about our
    attitude toward tools and toward tool-assisted collaboration. My
    desire is to ignite (or reignite, as the case may be) your
    frustration, to make you feel dissatisfied and unhappy when you
    work with your tools.

[] SAD FACE

  So here's a prime example...

* Better spam control techniques.

[] SPAM SUCKS. (Is there anyone here who disagrees?)

  Talk about the spam problem, and how everyone lives in a different
  spam universe.

  * My dad gets one to five spams a day, it seems like.

  * I've been on the 'Net for 15+ years, and I live in a completely
    different spam universe from him. Give stats.

  * Drupal comment spam.

  * CAPTCHAs are not going to magically make this problem go away
    (they're computable, and anyway there are accessibility concerns).

  I don't see any way to avoid the problem. And humans are going to be
  involved in any solution, until we have something that passes the
  Turing Test.

[] SPAM CONTROL TECHNIQUES SUCK TOO.

  Describe how spam moderation works today...

  ...then how it *could* work.

  Justin's point about identifying real people -- this solves that,
  because the people check the people too!

  By the way, this also gives me a chance to sneak in an important
  point about good tools. Whenever possible, they don't just increase
  the group's efficiency, they also contribute to everyone's sense of
  being in a group, with some kind of shared values, or shared enemies
  -- that sense of being in a room together working with like-minded
  people. In some ways that benefit can be more important than mere
  efficiency.

  The next tool we'll discuss is another example of that property.
  This one actually exists, too...

* Contribulyzer

[] THE CONTRIBULYZER

  Explain why the Contribulyzer was needed.

[] GUIDELINES FOR WRITING CONTRIBULYZER-FORMATTED LOG MESSAGES

[] EXAMPLE OF CONTRIBULYZER-FORMATTED LOG MESSAGE

[] MAIN CONTRIBULYZER SUMMARY PAGE

[] KAMESH'S SUMMARY PAGE

  Note how again, this is a combination of tools and process. Point
  out that once a pattern is adhered to 80% of the time, it will be
  self-reinforcing. Somewhere between 60% and 80% is the breakdown
  region; below 60%, you really can't count on people noticing it.
  It's a scientific fact :-).

  So, we can see how the Contribulyzer assists with one narrow aspect
  of keeping track of who's doing what. But it's limited to repository
  activity.

  Expanding on the theme, let's make it not just about commits (open
  source projects are already too focused on their developers and not
  enough on their support networks, so expanding beyond the committers
  and committer proxies is the right direction to go in general), but
  about participation in general. And let's try to make it
  bidirectional.

* IRC section.

[] IRC IS DATA RICH

[] SAD FACE

[] AFFERO ... USERS DON'T TAKE ACTION

[] THAT'S WHY THEY'RE CALLED USERS

[] SYSTEMS THAT DEPEND ON USERS ... SYSTEMS THAT OFFER USERS

  (but explain why this does not apply to the Contribulyzer syntax)

  So, I don't think in the long run we can depend on people telling us
  what they know. But we can harvest their conversations.

  What does a typical IRC conversation look like?
  Well, something like this:

[] TYPICAL IRC CONVERSATION

[] SEARCH:// syntax

  *** digression: "search://" syntax ***
  *** answers are not static anymore, they are dynamic ***
  *** (remember Netscape bookmarks file? Who still has those?) ***

  Anyway, back to IRC conversations

[] TYPICAL IRC CONVERSATION AGAIN

  How is it rich in meta-data? Well:

[] 1. INTERLOCUTORS ADDRESS EACH OTHER BY NAME...

[] 2. THEY RESPOND TO EACH OTHER IN REAL-TIME...

  What can we get from this? What could we enable if we kept track of
  this information and tried to parse it in some rudimentary way?

  (handwave "natural language processing" handwave)

[] AUTOMATIC ANSWERS

  1. Obviously bots could start accumulating a knowledge base of
     questions and answers. Apparently, some bots try this already,
     with limited success. I'm not personally familiar with them, and
     I don't know if they're using timing and interlocutor-addressing.

[] WHO HELPS WHOM, WHO KNOWS WHAT

  2. We can start keeping track of who helps whom, and who knows what.
     And you can look for words like "it's working", "fixed", "thanks
     for your help", etc. to know who's successfully helping -- and
     who's failing. (THE FAILURES ARE IMPORTANT, we'll come back to
     them later).

  Okay, that's kind of nice, but...

[] AUTOMATED FAQ MAINTENANCE

  3. Automated FAQ maintenance

  The trick is *not* to over-automate!

[] DON'T AUTOMATE, SUGGEST

  Remember the failures, pull those people back when possible.

[] AND ONCE KNOWLEDGE HAS BEEN CAPTURED, REACH BACK...

  Why?

  Optional: give AT&T example, prefacing with explanation that it
  might seem strange, but that it's a terrific example of the link
  between good processes, good tools, and human engineering.

  Talk about result of phone call, ever-increasing body of knowledge
  combined with lack of context in which to absorb that knowledge,
  long tail effect,

[] BRAIN SLIDE 1.

[] BRAIN SLIDE 2.

[] BRAIN SLIDE 3.

* Mailing list summary system.
[] COMPLEX TREE UNIVERSE

  explain the situation

[] SIMPLIFIED TREE UNIVERSE

  Mailing list traffic summaries are too hard to write; the right
  tools can make them easier. Relationship between list and summary
  should be bidirectional.

  Digression into mail messages not having archive link as part of the
  header.

* Multi-user commits.

[] PRETTY FLOWER

  Why don't any systems support multi-user commits??

  When I've brought this up before, sometimes people say "Well, you
  want a single point of responsibility." That's bogus -- that's
  confusing a tool limitation with a feature. If we actually *had*
  multi-user commits, we'd love it!

* These next ones in the "law as technology" group:

  8. Copyright assignment law is currently designed for a world in
     which people are physically present to sign forms. It's a problem
     for open source projects, and needs to be solved. If posting were
     enough, things would be *so* much easier.

  9. General group governance question: the law is designed around
     physical meetings. Yet decision procedures in mailing-list-based
     groups are unambiguous and just as decidable. If the law would
     recognize this, developer groups would have a much easier time
     formalizing their associations. I mean, we have voting software
     already, we just aren't allowed to use it.

* "urljump"

[] URLJUMP

  Deep-referencing into HTML pages is possible, but there is currently
  no standard for it. If we had a standard and enough browsers (or web
  servers) implemented it, certain kinds of references would be much
  easier to make.

* Extended patch format, globally unique IDs.

  Extended patch format for trading traceable changes between
  different projects that have a common ancestry or that share some
  code.

  Globally unique IDs for code authors, so that mailing list posts,
  patches, commits, IRC conversations, everything can be traced.
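
  A minimal sketch of what the globally-unique-ID idea could buy us,
  assuming a registry that maps each per-medium identity (committer
  name, mail address, IRC nick) to one author ID. Nothing like this
  registry exists yet; the `Activity` record, the `ALIASES` table, the
  "author:0001" ID format and the `trace` helper are all hypothetical
  names made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    medium: str    # e.g. "commit", "mail", "irc", "patch"
    identity: str  # the name the person uses in that medium
    summary: str


# Hypothetical registry: (medium, local identity) -> one global author ID.
ALIASES = {
    ("commit", "jrandom"): "author:0001",
    ("mail", "jrandom@example.org"): "author:0001",
    ("irc", "jr"): "author:0001",
    ("mail", "pat@example.org"): "author:0002",
}


def trace(activities, aliases):
    """Group activity records by global author ID.

    Identities missing from the registry are collected under None.
    """
    by_author = defaultdict(list)
    for act in activities:
        by_author[aliases.get((act.medium, act.identity))].append(act)
    return dict(by_author)


log = [
    Activity("commit", "jrandom", "Fix off-by-one in diff output"),
    Activity("irc", "jr", "Answered a question about merging"),
    Activity("mail", "pat@example.org", "Posted a patch for review"),
    Activity("irc", "stranger", "Asked about installation"),
]
traced = trace(log, ALIASES)
```

  With the registry in place, `traced["author:0001"]` holds both the
  commit and the IRC answer, even though they were made under
  different names. The unregistered nick ends up under `None`, which
  is exactly the untraceable activity the proposal is meant to get
  rid of.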
-----------------------------------------------------------------------------

10 [Collaboration] Tools Developers Need Today

(Original title: "Technologies of Cooperation".)

The goal of this talk is to get the audience thinking about
collaboration itself as a first-order activity, and one that is
extremely responsive to good tools.

The talk will start off with a discussion of collaboration tools that
have made a big difference (e.g., Wikis, the CIA commit watching
system). Then it will explore some new ideas for tools and techniques
that are not widespread yet, but that could really help open source
projects a lot:

2. Extended patch format for trading traceable changes between
   different projects that have a common ancestry or that share some
   code.

3. How to improve mailing list usage by better integration with the
   list archiver. Relationship between message and archive should be
   BIDIRECTIONAL.

4. Mailing list traffic summaries are too hard to write; the right
   tools can make them easier. Relationship between list and summary
   should be BIDIRECTIONAL.

5. Spam filtering techniques that could really reduce the moderation
   burden on OSS projects.

6. Inter-project attribution conventions would make it *much* easier
   to trace where a given programmer has been active. The Subversion
   project has started with a rudimentary form of this, but it needs a
   standardization drive.

7. "urljump": Deep-referencing into HTML pages is possible, but there
   is currently no standard for it. If we had a standard and enough
   browsers (or web servers) implemented it, certain kinds of
   references would be much easier to make.

These next ones in the "law as technology" group:

8. Copyright assignment law is currently designed for a world in which
   people are physically present to sign forms. It's a problem for
   open source projects, and needs to be solved.

9. General group governance question: the law is designed around
   physical meetings.
   Yet decision procedures in mailing-list-based groups are
   unambiguous and just as decidable. If the law would recognize
   this, developer groups would have a much easier time formalizing
   their associations.

(Depending on how much time the above fills when fully prepared, I may
add or take away some items.)

More ideas:

HelpNet.

Simple IRC improvement: ability to watch all channels for regexps, so
you can jump in where needed & have context.
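
A rough sketch of that last idea, assuming the IRC client hands us
each line as a (channel, nick, text) triple. The `ChannelWatcher`
class and its `feed` method are hypothetical names, not part of any
existing client or bot. It keeps a short rolling history per channel,
so that when a watched regexp matches you get the preceding lines as
context:

```python
import re
from collections import defaultdict, deque


class ChannelWatcher:
    """Remember the last few lines of every channel and flag regexp matches."""

    def __init__(self, patterns, context_lines=3):
        self.patterns = [re.compile(p, re.IGNORECASE) for p in patterns]
        # One bounded buffer per channel; old lines fall off automatically.
        self.history = defaultdict(lambda: deque(maxlen=context_lines))

    def feed(self, channel, nick, text):
        """Record a line; return (channel, context, line) on a match, else None."""
        line = f"<{nick}> {text}"
        context = list(self.history[channel])  # lines seen before this one
        self.history[channel].append(line)
        if any(p.search(text) for p in self.patterns):
            return channel, context, line
        return None


watcher = ChannelWatcher([r"\bmerge conflict\b", r"\bsvn\b.*\berror\b"],
                         context_lines=2)
hits = []
for channel, nick, text in [
    ("#dev", "alice", "morning all"),
    ("#help", "bob", "hi, quick question"),
    ("#help", "bob", "I keep getting a merge conflict on update"),
    ("#dev", "carol", "lunch?"),
]:
    hit = watcher.feed(channel, nick, text)
    if hit:
        hits.append(hit)
```

Only the "#help" line about the merge conflict is flagged, and it
arrives together with bob's earlier line from the same channel.
Keeping the history per channel is the point: the context you jump
into is the conversation where the match happened, not whatever was
said most recently elsewhere.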