AI in the school playground

Buried in Apple’s Developer Conference last month was the release of “a PDF format for AI”. Build a model on one system, and open it identically on another. As PDF did for document sharing, this will do for sharing AI models. The person who uses an AI no longer has to be the person who trained it.

Training is feeding a model a load of past data where you already know the outcomes, so it figures out how to reproduce them. Use is feeding it new data, so it tells you what it thinks the outcome should be.
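A minimal sketch of that split, using a toy one-variable straight-line model and made-up numbers (both are illustrative assumptions of mine, not anything any vendor ships):

```python
# Training: past data where the outcomes are already known (made-up numbers).
past_x = [1.0, 2.0, 3.0, 4.0]
past_y = [2.1, 3.9, 6.2, 7.8]

# "Figure out how to reproduce them": fit y = a*x + b by least squares.
n = len(past_x)
mean_x = sum(past_x) / n
mean_y = sum(past_y) / n
a = sum((x - mean_x) * (y - mean_y) for x, y in zip(past_x, past_y)) / \
    sum((x - mean_x) ** 2 for x in past_x)
b = mean_y - a * mean_x

# Use: feed the trained model new data and ask what the outcome should be.
new_x = 5.0
prediction = a * new_x + b
```

Once `a` and `b` exist, anyone can run the "use" step without ever seeing the training data — which is exactly the separation a portable model format exploits.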

There are also additional rounds of training once “new” data has become “past” data, carrying both the outcome the model expected and the outcome it actually got. (One reason the AI outfits use games to develop models is that the score gives an instant, simple, and clear metric of success and improvement.)

This approach works easily for numbers. Text is harder, but still doable if you have enough context, or if the target is important enough.

There is a twitter account, @trumps_feed, which mirrors the twitter feed that Donald Trump reads. There is another feed of what he then does. It is unimaginable that various entities around the world are not feeding both to an AI, and it takes very few resources to extend that back through the historical record.

That gives you an AI to predict what DJT might say in response to something: feed it all Donald’s tweets, and you can produce a model of him. The model takes a load of processing to build, but once built, it can be freely shared. Given the commercial services that already monitor what particular targets say on twitter, pretty soon they’ll offer this kind of analysis too.

Doing that with the facebook / tumblr / instagram / twitter of popular (or otherwise) people, including teenagers, gets very creepy very fast. Twitter may be easy, but facebook has emotional colour. This is also why AIs looking for intent train much better when there’s an emoji giving meaning in the training dataset.

Facebook imply that they already do something like this: their salesmen brag that they can tell when users feel something – and of course facebook outsource “doing something” with such identifications to the highest bidder. But today’s megacorp-only tool is tomorrow’s app project – everything becomes available to everyone.

Samaritans Radar shows what happens when institutions get this wrong – but the copying of models means it’s no longer just institutions. If someone, or their app, can read your facebook feed, these new tools mean they will be able to make the same inferences facebook claim to make.

The best analogy is simple: Cambridge Analytica’s mindset and tools in the hands of every child in the playground.

Apple would likely keep such an app off their App Store, but Google Play would welcome it with open arms. The AI people would claim that’s not their problem.

posted: 21 Aug 2017

No AI tool that makes logging optional should be expected to operate safely.

If “safety” is something you have to turn on, any tool will be used unsafely. Today, unsafe yet powerful AI tools are made available to anyone, for free – tensorflow, openAI, et al.

Even America has some regulations on handing out free guns, but powerful AI features are added for commercial reasons and turned loose on the public. It is sheer folly to expect that free tools will be used with the same level of responsibility by individuals as by organisations that claim effective checks and balances. And the organisations use the same tools internally as they release to others – in a dash for market share and for the mindshare of developers.

The release of powerful AI tools without a correspondingly available audit or safety infrastructure suggests the claims to prioritise safety are entirely hollow – as shipped, AI safety mechanisms are entirely optional, with the burden on every user and not on the system.

External logging infrastructure is basically a spreadsheet. Google Docs’ spreadsheet is the minimum viable option, available to everyone (with a google account), for free. Google gives 15GB of storage space with every account, which should be good for years of run logs – and even then, google spreadsheets don’t count against that space. The spreadsheet API lets you do this already – and you can implement a blockchain in a spreadsheet to get some level of audit trail.
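The “blockchain in a spreadsheet” idea reduces to hash-chained rows: each row commits to the one before it, so any later edit or deletion is detectable. A sketch of the core in plain Python (the row contents are illustrative; in practice each dict would become one spreadsheet row appended via the API):

```python
import hashlib
import json

def chained_row(prev_hash, record):
    """One log row whose hash covers the previous row's hash, so any
    later edit or deletion breaks the chain from that point on."""
    payload = json.dumps(record, sort_keys=True)
    row_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    return {"prev": prev_hash, "record": record, "hash": row_hash}

def verify(log):
    """Walk the chain from a fixed genesis value; True only if nothing
    was edited, inserted, or deleted."""
    prev = "0" * 64
    for row in log:
        payload = json.dumps(row["record"], sort_keys=True)
        expected = hashlib.sha256((prev + payload).encode()).hexdigest()
        if row["prev"] != prev or row["hash"] != expected:
            return False
        prev = row["hash"]
    return True

# Illustrative run log: three events appended in order.
log = []
prev = "0" * 64
for event in ["train:v1", "predict:v1", "retrain:v2"]:
    row = chained_row(prev, {"event": event})
    log.append(row)
    prev = row["hash"]
```

Silently dropping a row, or rewriting an old one, makes `verify` fail – which is the whole point: the log stays a plain spreadsheet, but a deletion is no longer invisible.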

If you don’t want to use the google tools (tensorflow / Cloud Platform / gDrive), there is also AWS. Amazon will rent you high-powered GPUs for cents per hour, and has APIs for everything, including write-only logging to S3 – but no support for ensuring even a minimal audit of the AIs you run there.

For the rugged individualists replicating AlphaGo in their back bedrooms, who don’t want to hand their logs to a megacorp, those tools could allow logging to other infrastructures (Plasma would let you confirm you are logging without saying what; many other storage options exist today).

The current version of the test is: No AI tool that makes logging optional can be expected to operate safely.

The logs can be kept private, and they can be high-level, but they have to be written. What should go in them is a debate the well-funded AI community can continue to have; as it stands today, the commitment to AI safety is impossible to deliver, because the tools do not require that anything be logged. Nothing stops a decision to delete logs later – but deleting logs is a coverup.
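What a single entry might minimally hold is itself part of that debate; one illustrative shape (every field name here is my assumption, and hashing the input shows how a log can be written without the log itself exposing private data):

```python
import datetime
import hashlib
import json

def run_log_entry(model_id, inputs, output):
    """A minimal, privacy-preserving run log entry: which model ran,
    when, on what (hashed, so the log can stay private), and the result."""
    input_blob = json.dumps(inputs, sort_keys=True).encode()
    return {
        "timestamp": datetime.datetime.utcnow().isoformat() + "Z",
        "model": model_id,
        "input_sha256": hashlib.sha256(input_blob).hexdigest(),
        "output": output,
    }

# Hypothetical use: log one prediction made by a shared model.
entry = run_log_entry("example-model-v1", {"text": "some new data"}, "predicted label")
```

The entry records that a run happened and what came out, while the raw input stays with whoever ran it – private, high-level, but written.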

This won’t do everything, but currently there’s nothing at all. And nothing at all makes the claims of AI safety sound distinctly hollow.

posted: 16 Aug 2017