
Tuesday, May 22, 2018

Collecting organizational knowledge: a hybrid approach

Stack Overflow recently announced Stack Overflow for Teams. I'm hopeful that it will be what you wish your company wiki (that nobody uses) was.

I'm a big believer in starting with tools that people actually use, and it just makes so much more sense for a project manager to post a question on Stack Overflow than to ask the team the conventional way: "Can someone write up how the new AWS deployment works?" Somehow, answering the question through Stack Overflow's interface and conventions makes it seem so much easier to me as an engineer; maybe it's that the format puts direct information ahead of fancy or comprehensive documentation?

This ongoing question of how to collect an organization's knowledge is endlessly fascinating to me. I think most organizations should employ a librarian specifically to do this, by interviewing people verbally (it's so much easier to answer verbal questions than written ones), following discussions, and creating an internal newsletter. (Compass, the tech-boosted real estate brokerage, has a full-time internal journalist who does this.)

Discussing S.O.F.T. with friends, we wondered how close this sort of information collection is to something that could be done implicitly and on the fly, rather than explicitly and deliberately, in advance of the moment the knowledge is requested.

For instance, Slack famously claims in their promotional interviews and writeups that explicit documentation should be a thing of the past, because Slack can use search to source relevant comments and explanations. In practice, I think it's widely agreed that this doesn't work well.

One friend wondered if machine learning was close to being able to collect documentation, either just in time or by preemptively collecting it into reports. I think there's too much complex context to do that well using a pure machine learning approach, at least for the near future.

But what I do think is possible, if not now then at least very soon, is a hybrid approach where machine learning augments deliberate human work, by sourcing suggestions that prompt humans to pick up the ball and run with it. An AI-curated list of likely useful links and excerpts wouldn't replace human editing and summarization, because it wouldn't try to; it would just make that process shorter, most of the time.

For instance, take a slightly different domain--AI-assisted scheduling, which is far behind where glitzy startups would like you to believe. While I don't think machine learning is good enough yet to handle scheduling communications by itself, I do think it would be reasonable to train a model to look at your email history and calendars for events that appear to be calls or in-person meetings, and to make a Gmail widget that suggests times accordingly, but also lets you just pick them explicitly. That approach would make the system's worst errors inconsequential, while letting it save you time in the typical case. It's the sort of thing that I think people might actually use, without feeling like they are struggling against the machine.
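To make that concrete, here's a minimal sketch of how the suggestion side might work, assuming the hard part (pulling busy intervals out of the calendar) is already done; the function name and shape are hypothetical, not any real API:

```python
from datetime import datetime, timedelta

def suggest_slots(busy, day_start, day_end,
                  duration=timedelta(minutes=30), limit=3):
    """Suggest up to `limit` free slots of `duration` between day_start
    and day_end, given a list of (start, end) busy intervals."""
    suggestions = []
    cursor = day_start
    for start, end in sorted(busy):
        # Fill free time before this busy interval.
        while cursor + duration <= start and len(suggestions) < limit:
            suggestions.append(cursor)
            cursor += duration
        cursor = max(cursor, end)
    # Fill whatever free time remains at the end of the day.
    while cursor + duration <= day_end and len(suggestions) < limit:
        suggestions.append(cursor)
        cursor += duration
    return suggestions
```

The widget would surface these as defaults while still letting you pick any time by hand, which is what keeps the model's mistakes cheap.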

Getting more ambitious, what about a librarian service that has plugins for Slack, Gmail, Asana, etc., and steadily brings notable items to your attention? First and foremost, it could make it very easy to flag snippets of conversation or changes of product requirements as having long-term informational value. Second, it could respond to your choices by helping you navigate among possible related items, by shifting the thresholds for showing items as you accept and reject suggestions. Someone in a project management role who doesn't have a deep technical background might still be able to produce helpful collections of info, even if they require someone with deeper technical knowledge to apply context and tie the information together.
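That accept/reject feedback loop can be sketched very simply; in this toy version the relevance scores and the fixed adjustment step are both stand-ins for what would really be a trained model:

```python
class SuggestionFilter:
    """Show items whose relevance score clears a threshold, and nudge
    the threshold based on whether the human accepts or rejects them."""

    def __init__(self, threshold=0.5, step=0.05):
        self.threshold = threshold
        self.step = step

    def should_show(self, score):
        return score >= self.threshold

    def record_feedback(self, accepted):
        # Accepting suggestions loosens the filter (surface more
        # borderline items); rejecting them tightens it.
        if accepted:
            self.threshold = max(0.0, self.threshold - self.step)
        else:
            self.threshold = min(1.0, self.threshold + self.step)
```

The point isn't the arithmetic; it's that the human's choices steer what the machine surfaces next, so a non-technical curator can still do useful triage.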

Then, when, say, someone in a Slack conversation asks if we are still using X, that could get flagged by the system and reasonably turn into a note on the document that details the active stack for that project. There would still be false positives and false negatives, but the incidence of both would be reduced by putting humans in the middle.
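A crude first pass at spotting those "are we still using X?" messages could just be pattern matching that hands the thread to a human curator; the patterns below are made-up examples, and a real system would presumably use a classifier instead:

```python
import re

# Hypothetical patterns for questions that hint at stale documentation.
STALE_DOC_PATTERNS = [
    re.compile(r"\bare we still (using|on)\b", re.IGNORECASE),
    re.compile(r"\bis .* still (the|our) (standard|default)\b", re.IGNORECASE),
]

def flag_for_curation(message):
    """Return True if a chat message looks like a question about the
    current stack, so a human can turn the answer into a doc note."""
    return any(p.search(message) for p in STALE_DOC_PATTERNS)
```

Even a matcher this dumb would miss plenty and over-trigger sometimes, which is exactly why the flag goes to a person rather than straight into the docs.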

At least for the immediate future, this sort of thing is much more about sophisticated human UX design than it is about advanced ML; and I suspect that will be true in many domains for a long time.
