[] Pretty flower.


* Introduction

  Misprint: Google, not CollabNet.

  This talk was originally called "Technologies of Cooperation".  Then
  they suggested I change it to "Ten Tools Developers Need".  Sure, I
  said.  But it's still the same talk.

  Having the right tools can make more difference in a project than
  just about anything else.

[] GOOD TOOLS

     - "We need more content, let's try to get more people writing
        HTML pages" vs just installing a Wiki.

     - anonymous read-only CVS.  Make sure to ask people to remember
       back to the days before it.  

       Let this lead into the "accidental invention" factor, which then
       leads into how we cannot predict how many of the ideas in this
       talk are really worth anything.

  * The art class metaphor.

  * This talk will discuss some existing tools, and propose some new
    ones, *but* it's not really about tools at all.  It is about our
    attitude toward tools and toward tool-assisted collaboration.  My
    desire is to ignite (or reignite, as the case may be) your
    frustration, to make you feel dissatisfied and unhappy when you
    work with your tools.

[] SAD FACE

    So here's a prime example...


* Better spam control techniques.

[] SPAM SUCKS.

  (Is there anyone here who disagrees?)

Talk about the spam problem, and how everyone lives in a different
spam universe.

   * My dad gets one to five spams a day, it seems like.

   * I've been on the 'Net for 15+ years, and I live in a
     completely different spam universe from him.  Give stats.

   * Drupal comment spam.

   * CAPTCHAs are not going to magically make this problem go away
     (they can be solved computationally, and they raise
     accessibility concerns anyway).

I don't see any way to avoid the problem.  And humans are going to be
involved in any solution, until we have something that passes the
Turing Test.

[] SPAM CONTROL TECHNIQUES SUCK TOO.

Describe how spam moderation works today...

...then how it *could* work.

Justin's point about identifying real people -- this solves that,
because the people check the people too!

By the way, this also gives me a chance to sneak in an important point
about good tools.  Whenever possible, they don't just increase the
group's efficiency, they also contribute to everyone's sense of being
in a group, with some kind of shared values, or shared enemies -- that
sense of being in a room together working with like-minded people.  In
some ways that benefit can be more important than mere efficiency.

The next tool we'll discuss is another example of that property.  This
one actually exists, too...


* Contribulyzer

[] THE CONTRIBULYZER

Explain why the Contribulyzer was needed.

[] GUIDELINES FOR WRITING CONTRIBULYZER-FORMATTED LOG MESSAGES

[] EXAMPLE OF CONTRIBULYZER-FORMATTED LOG MESSAGE

[] MAIN CONTRIBULYZER SUMMARY PAGE

[] KAMESH'S SUMMARY PAGE

Note how again, this is a combination of tools and process.
Point out that once a pattern is adhered to 80% of the time, it will
be self-reinforcing.  Somewhere between 60% and 80% is the breakdown
region; below 60%, you really can't count on people noticing it.

                  It's a scientific fact :-).
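
To make the tools-plus-process point concrete, here is a rough sketch
of what a Contribulyzer-style parser might do with a log message.  The
field names ("Patch by:", "Review by:", and so on), the sample log
message, and the function name are assumptions for illustration, not a
description of the real tool's exact rules.

```python
import re
from collections import defaultdict

# Assumed field names, for illustration only.
FIELD_RE = re.compile(
    r"^(Patch|Review|Suggested|Found) by:\s*(.+)$", re.IGNORECASE
)

def contributions(log_message):
    """Return {person: set of roles} credited in one log message."""
    credits = defaultdict(set)
    for line in log_message.splitlines():
        match = FIELD_RE.match(line.strip())
        if match:
            role, people = match.group(1).lower(), match.group(2)
            # Several people may share one field, separated by commas.
            for person in people.split(","):
                if person.strip():
                    credits[person.strip()].add(role)
    return dict(credits)

# Hypothetical log message following the assumed convention.
sample = """Fix crash when the working copy is locked.

Patch by: Jane Doe <jane@example.com>
Review by: someuser, otheruser
"""
print(contributions(sample))
```

The parser only works because people follow the convention, which is
the point: the tool is cheap, and the process is what feeds it.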

So, we can see how the Contribulyzer assists with one narrow aspect of
keeping track of who's doing what.  But it's limited to repository
activity.  Expanding on the theme, let's make it not just be about
commits (open source projects are already too focused on their
developers and not enough on their support networks, so expanding
beyond the committers and committer proxies is the right direction to
go in general), but about participation in general.  And let's try to
make it bidirectional.


* IRC section.

[] IRC IS DATA RICH

[] SAD FACE

[] AFFERO ... USERS DON'T TAKE ACTION

[] THAT'S WHY THEY'RE CALLED USERS

[] SYSTEMS THAT DEPEND ON USERS ... SYSTEMS THAT OFFER USERS

   (but explain why this does not apply to the Contribulyzer syntax)

So, I don't think in the long run we can depend on people telling us
what they know.  But we can harvest their conversations.

What does a typical IRC conversation look like?  Well, something like
this:

[] TYPICAL IRC CONVERSATION

[] SEARCH:// syntax
  *** digression: "search://" syntax ***
  *** answers are not static anymore, they are dynamic ***
  *** (remember Netscape bookmarks file?  Who still has those?) ***

Anyway, back to IRC conversations

[] TYPICAL IRC CONVERSATION AGAIN

How is it rich in meta-data?  Well:

[] 1. INTERLOCUTORS ADDRESS EACH OTHER BY NAME...

[] 2. THEY RESPOND TO EACH OTHER IN REAL-TIME...

What can we get from this?  What could we enable if we kept track of
this information and tried to parse it in some rudimentary way?
(handwave "natural language processing" handwave)
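
A rough sketch of how those two signals (addressing by name, real-time
response) might be pulled out of a log.  The log line format, the
"nick:" addressing convention, and the two-minute window are all
assumptions; no real natural language processing happens here.

```python
import re
from datetime import datetime, timedelta

# Assumed log format: "HH:MM <nick> message text".
LINE_RE = re.compile(r"^(\d\d:\d\d) <([^>]+)> (.*)$")
# Assumed addressing convention: "nick: ..." or "nick, ..." at the start.
ADDRESS_RE = re.compile(r"^([A-Za-z0-9_\-\[\]]+)[:,]\s+(.*)$")

def parse(lines):
    """Yield (time, speaker, addressee or None, text) for each log line."""
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        when = datetime.strptime(m.group(1), "%H:%M")
        speaker, text = m.group(2), m.group(3)
        addressed = ADDRESS_RE.match(text)
        if addressed:
            yield when, speaker, addressed.group(1), addressed.group(2)
        else:
            yield when, speaker, None, text

def exchanges(lines, window=timedelta(minutes=2)):
    """Pair each question with replies addressed to the asker within `window`."""
    pairs = []
    open_questions = {}  # asker -> (time asked, question text)
    for when, speaker, addressee, text in parse(lines):
        # Only names with an open question count, which filters out
        # false "addressees" such as the "well" in "well, I think so".
        if addressee in open_questions:
            asked_at, question = open_questions[addressee]
            if when - asked_at <= window:
                pairs.append((addressee, question, speaker, text))
        if text.rstrip().endswith("?"):
            open_questions[speaker] = (when, text)
    return pairs

# Hypothetical conversation.
log = [
    "14:01 <newbie> how do I undo a commit?",
    "14:02 <helper> newbie: try a reverse merge",
    "14:30 <lurker> newbie: did that work?",
]
print(exchanges(log))
```

Only the reply at 14:02 is paired with the question; the one at 14:30
falls outside the window, which is the timing signal at work.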

[] AUTOMATIC ANSWERS

   1. Obviously bots could start accumulating a knowledge base of
      questions and answers.  Apparently, some bots try this already,
      with limited success.  I'm not personally familiar with them,
      and I don't know if they're using timing and
      interlocutor-addressing.

[] WHO HELPS WHOM, WHO KNOWS WHAT

   2. We can start keeping track of who helps whom, and who knows
      what.  And you can look for phrases like "it's working", "fixed",
      "thanks for your help", etc., to know who's successfully helping
      -- and who's failing.  (THE FAILURES ARE IMPORTANT, we'll come
      back to them later).

      Okay, that's kind of nice, but...
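
A minimal sketch of that bookkeeping, assuming messages have already
been reduced to (speaker, addressee, text) tuples.  The input format,
the function name, and the crude substring matching are assumptions
for illustration.

```python
from collections import Counter

# Phrases taken from the notes above; a real list would be longer.
SUCCESS_PHRASES = ("it's working", "fixed", "thanks for your help")

def tally_help(messages):
    """Count successful helping attempts and collect unresolved ones.

    `messages` is a list of (speaker, addressee, text) tuples, where
    addressee is None when nobody is addressed by name.
    """
    helped = Counter()  # (helper, asker) -> number of successes
    pending = {}        # asker -> most recent person who addressed them
    for speaker, addressee, text in messages:
        lowered = text.lower()
        is_success = any(p in lowered for p in SUCCESS_PHRASES)
        if is_success and speaker in pending:
            # The asker signals success: credit whoever last addressed them.
            helped[(pending.pop(speaker), speaker)] += 1
        elif addressee is not None:
            pending[addressee] = speaker
    # Whatever is left never got a success signal: the failures.
    return helped, dict(pending)

# Hypothetical conversation.
messages = [
    ("newbie", None, "how do I undo a commit?"),
    ("helper", "newbie", "try a reverse merge"),
    ("newbie", "helper", "it's working, thanks for your help"),
    ("other", None, "my build is broken"),
    ("helper", "other", "did you rerun configure?"),
]
helped, unresolved = tally_help(messages)
print(helped, unresolved)
```

The `unresolved` mapping is the interesting part: those are the
people to pull back in later.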
      
[] AUTOMATED FAQ MAINTENANCE

   3. Automated FAQ maintenance

      The trick is *not* to over-automate!

[] DON'T AUTOMATE, SUGGEST

   <individual brain as lens for collective knowledge>

Remember the failures, pull those people back when possible.

[] AND ONCE KNOWLEDGE HAS BEEN CAPTURED, REACH BACK...

  Why?

  Optional: give AT&T example, prefacing with explanation that it
  might seem strange, but that it's a terrific example of the link
  between good processes, good tools, and human engineering.
  Talk about result of phone call, ever-increasing body of knowledge
  combined with lack of context in which to absorb that knowledge,
  long tail effect.

[] BRAIN SLIDE 1.
[] BRAIN SLIDE 2.
[] BRAIN SLIDE 3.


* Mailing list summary system.

[] COMPLEX TREE UNIVERSE

  explain the situation

[] SIMPLIFIED TREE UNIVERSE

  Mailing list traffic summaries are too hard to write; the right
  tools can make them easier.  Relationship between list and summary
  should be bidirectional.

  Digression into mail messages not having archive link as part of the
  header.
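
  A sketch of what carrying the archive link in the header could look
  like, using Python's standard email library.  The header name
  follows the Archived-At field from RFC 5064; the archive base URL
  and the scheme of deriving the link from the Message-ID are
  hypothetical.

```python
from email.message import EmailMessage
from urllib.parse import quote

# Hypothetical archive location; a real list would use its own archiver URL.
ARCHIVE_BASE = "https://lists.example.org/archive/dev/"

def add_archive_link(msg):
    """Add an Archived-At header derived from the message's Message-ID."""
    message_id = msg["Message-ID"].strip("<>")
    # Percent-encode the ID so it is safe as a single URL path segment.
    msg["Archived-At"] = "<" + ARCHIVE_BASE + quote(message_id, safe="") + ">"
    return msg

msg = EmailMessage()
msg["From"] = "dev@example.org"
msg["Subject"] = "Proposed change to the release process"
msg["Message-ID"] = "<1234.abcd@example.org>"
msg.set_content("A summary could link back to this exact message.")

print(add_archive_link(msg)["Archived-At"])
```

  With the link in every message, a summary tool can point at the
  archive and the archive can point back, which is the bidirectional
  relationship again.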


* Multi-user commits.

[] PRETTY FLOWER

  Why don't any systems support multi-user commits??
  When I've brought this up before, sometimes people say "Well, you
  want a single point of responsibility."  That's bogus -- that's
  confusing a tool limitation with a feature.  If we actually *had*
  multi-user commits, we'd love it!


* These next ones are in the "law as technology" group:

8. Copyright assignment law is currently designed for a world in which
   people are physically present to sign forms.  It's a problem for
   open source projects, and needs to be solved.  If posting were
   enough, things would be *so* much easier.

9. General group governance question: the law is designed around
   physical meetings.  Yet decision procedures in mailing-list-based
   groups are unambiguous and just as decidable.  If the law would
   recognize this, developer groups would have a much easier time
   formalizing their associations.  I mean, we have voting software
   already, we just aren't allowed to use it.


*  "urljump"

[] URLJUMP

   Deep-referencing into HTML pages is possible, but there is
   currently no standard for it.  If we had a standard and enough
   browsers (or web servers) implemented it, certain kinds of
   references would be much easier to make.
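
   A toy sketch of one way such a reference could work: the phrase to
   jump to travels in the URL fragment, and the client locates it in
   the page text.  The "urljump=" fragment syntax and both function
   names are invented here purely for illustration; they are not a
   standard or an existing implementation.

```python
from html.parser import HTMLParser
from urllib.parse import quote, unquote, urldefrag

PREFIX = "urljump="  # hypothetical fragment syntax, not a standard

class _TextExtractor(HTMLParser):
    """Collect the text content of a page, ignoring the markup."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def make_urljump(url, phrase):
    """Build a URL whose fragment names a phrase inside the page."""
    return url + "#" + PREFIX + quote(phrase, safe="")

def resolve_urljump(jump_url, html):
    """Return the offset of the referenced phrase in the page text, or -1."""
    _, fragment = urldefrag(jump_url)
    if not fragment.startswith(PREFIX):
        return -1
    phrase = unquote(fragment[len(PREFIX):])
    extractor = _TextExtractor()
    extractor.feed(html)
    extractor.close()
    # Collapse whitespace so inline tags and line breaks do not matter.
    text = " ".join("".join(extractor.chunks).split())
    return text.find(phrase)

# Hypothetical page: the phrase spans an inline <b> element.
page = "<html><body><h1>Notes</h1><p>Why not <b>multi-user</b> commits?</p></body></html>"
link = make_urljump("http://example.org/page.html", "multi-user commits")
print(link, resolve_urljump(link, page))
```

   Note that the phrase is found even though it crosses a tag
   boundary, which is exactly what a plain anchor cannot do unless the
   page author put one there.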


* Extended patch format, globally unique IDs.

  Extended patch format for trading traceable changes between
  different projects that have a common ancestry or that share some
  code.  Globally unique IDs for code authors, so that mailing list
  posts, patches, commits, IRC conversations, everything can be
  traced.


-----------------------------------------------------------------------------
             10 [Collaboration] Tools Developers Need Today

(Original title: "Technologies of Cooperation".)

The goal of this talk is to get the audience thinking about
collaboration itself as a first-order activity, and one that is
extremely responsive to good tools.  The talk will start off with a
discussion of collaboration tools that have made a big difference
(e.g., Wikis, the CIA commit watching system).  Then it will explore
some new ideas for tools and techniques that are not widespread yet,
but that could really help open source projects a lot:

2. Extended patch format for trading traceable changes between
   different projects that have a common ancestry or that share some
   code.

3. How to improve mailing list usage by better integration with the
   list archiver.  Relationship between message and archive should be
   BIDIRECTIONAL.

4. Mailing list traffic summaries are too hard to write; the right
   tools can make them easier.  Relationship between list and summary
   should be BIDIRECTIONAL.

5. Spam filtering techniques that could really reduce the moderation
   burden on OSS projects.

6. Inter-project attribution conventions would make it *much* easier
   to trace where a given programmer has been active.  The Subversion
   project has started with a rudimentary form of this, but it needs a
   standardization drive.

7. "urljump": Deep-referencing into HTML pages is possible, but there
   is currently no standard for it.  If we had a standard and enough
   browsers (or web servers) implemented it, certain kinds of
   references would be much easier to make.

These next ones are in the "law as technology" group:

8. Copyright assignment law is currently designed for a world in which
   people are physically present to sign forms.  It's a problem for
   open source projects, and needs to be solved.

9. General group governance question: the law is designed around
   physical meetings.  Yet decision procedures in mailing-list-based
   groups are unambiguous and just as decidable.  If the law would
   recognize this, developer groups would have a much easier time
   formalizing their associations.

(Depending on how much time the above fills when fully prepared, I may
add or take away some items.)

More ideas:

HelpNet.

Simple IRC improvement: ability to watch all channels for regexps, so
you can jump in where needed & have context.
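
A minimal sketch of the regexp watcher, leaving out the IRC connection
itself.  The class name, the alert format, and the idea of feeding it
one (channel, nick, text) message at a time are assumptions for
illustration.

```python
import re
from collections import defaultdict, deque

class ChannelWatcher:
    """Watch every channel for patterns, keeping recent lines as context."""

    def __init__(self, patterns, context_lines=3):
        self.patterns = [re.compile(p, re.IGNORECASE) for p in patterns]
        # One bounded buffer per channel, so each alert carries local context.
        self.history = defaultdict(lambda: deque(maxlen=context_lines))

    def feed(self, channel, nick, text):
        """Record one message; return an alert dict if any pattern matches."""
        context = list(self.history[channel])
        self.history[channel].append((nick, text))
        if any(p.search(text) for p in self.patterns):
            return {"channel": channel, "nick": nick,
                    "text": text, "context": context}
        return None

# Hypothetical traffic on two channels.
watcher = ChannelWatcher([r"\bmerge conflict\b"], context_lines=2)
watcher.feed("#svn", "a", "hi all")
watcher.feed("#svn", "b", "anyone around?")
watcher.feed("#dev", "c", "unrelated chatter")
alert = watcher.feed("#svn", "b", "I hit a merge conflict in trunk")
print(alert)
```

The context comes only from the channel where the match happened, so
you can jump in already knowing what led up to it.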