TiredOldCrow t1_ivpr1wm wrote

For reference, here's a link to the BigScience RAIL License.

The license includes usage restrictions that specifically forbid illegal and harmful uses (many of which are already prohibited under existing law), as well as other uses the model authors are uneasy about (e.g., medical advice, law enforcement, and automated decision-making).

Researchers are effectively attempting to morally (and perhaps legally) absolve themselves of abusive uses of the technology they have developed. They also hope to retain the right to approve or deny usage in sensitive areas on a case-by-case basis.

Personally, I'm sympathetic to the dilemma researchers face, given the large potential for abuse of these models and the relative lack of AI regulation in some jurisdictions. That said, I believe usage restrictions are ideally enforced through hard legislation, not software licensing. Existing laws and policies, such as Article 22 of the GDPR or Canada's TBS Directive on Automated Decision-Making, should serve as a template.


TiredOldCrow t1_iv8tqar wrote

I know it's naive to expect machine learning to imitate life too closely, but for animals, "models" that are successful enough to produce offspring pass on elements of those "weights" to their children through nature+nurture.

The idea of weighting more successful previous models more heavily when "reincarnating" future models, and perhaps borrowing concepts from genetic algorithms for combining multiple successful models, seems interesting to me.
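
Just to make the idea concrete, here's a PyTorch-flavored toy sketch of what I mean (the function names, fitness-weighted averaging, and per-tensor crossover are my own illustration, and naive weight averaging across independently trained networks has well-known pitfalls):

# Toy illustration only: initialize a "reincarnated" model from ancestor
# checkpoints, weighting more successful ancestors more heavily, plus a
# GA-style crossover. Assumes all state_dict entries are float tensors.
import copy
import torch

def fitness_weighted_init(parents, fitnesses):
    """Fitness-weighted average of parent state_dicts."""
    w = torch.tensor(fitnesses, dtype=torch.float32)
    w = w / w.sum()
    child = copy.deepcopy(parents[0].state_dict())
    for name in child:
        child[name] = sum(wi * p.state_dict()[name] for wi, p in zip(w, parents))
    return child

def crossover(parent_a, parent_b, p=0.5):
    """GA-style crossover: each tensor is inherited wholesale from one parent."""
    child = copy.deepcopy(parent_a.state_dict())
    other = parent_b.state_dict()
    for name in child:
        if torch.rand(1).item() < p:
            child[name] = other[name].clone()
    return child

# e.g. new_model.load_state_dict(fitness_weighted_init([m1, m2, m3], [0.9, 0.6, 0.3]))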


TiredOldCrow t1_iuzp1y3 wrote

I appreciate that the legendary "fast inverse square root" code from Quake 3 gets produced verbatim, comments and all, if you start with "float Q_rsqrt".

float Q_rsqrt( float number )
{
	long i;
	float x2, y;
	const float threehalfs = 1.5F;

	x2 = number * 0.5F;
	y  = number;
	i  = * ( long * ) &y;                       // evil floating point bit level hacking
	i  = 0x5f3759df - ( i >> 1 );               // what the fuck? 
	y  = * ( float * ) &i;
	y  = y * ( threehalfs - ( x2 * y * y ) );   // 1st iteration
//	y  = y * ( threehalfs - ( x2 * y * y ) );   // 2nd iteration, this can be removed

	return y;
}

I'm interested in how practical it will be for a motivated attacker to poison code generation models with vulnerable code. I'm also curious to what extent these models produce code that only works with outdated and vulnerable dependencies -- a problem you'll also run into if you naively copy old StackOverflow posts. I've recently been working on threat models for natural language generation, but it seems like threat models for code generation are also going to be interesting.
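
For example, here's the kind of thing I'd worry about a model trained on (or poisoned with) older code suggesting -- the snippet below is my own illustration of well-known insecure idioms, not actual model output:

# Illustrative only: insecure patterns common in older code that a code
# generation model might happily reproduce.
import hashlib
import sqlite3
import yaml

def store_password(db: sqlite3.Connection, user: str, password: str) -> None:
    # Unsalted MD5 is a classic outdated pattern; a slow, salted KDF
    # (hashlib.scrypt, bcrypt, argon2) is the modern choice.
    digest = hashlib.md5(password.encode()).hexdigest()
    # String-formatted SQL invites injection; parameterized queries avoid it.
    db.execute(f"INSERT INTO users VALUES ('{user}', '{digest}')")

def load_config(text: str):
    # yaml.Loader can construct arbitrary Python objects from untrusted input;
    # yaml.safe_load is the safer default.
    return yaml.load(text, Loader=yaml.Loader)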

Edit: Not John Carmack!


TiredOldCrow t1_its89ot wrote

Since you're using different pre-trained VGG16 models as a starting point, you may just be demonstrating that the PyTorch torchvision model is more amenable to your combination of hyperparameters than the TensorFlow one.

Ideally for this kind of comparison you'd use the exact same pretrained model architecture+weights as a starting point. Maybe look for a set of weights that has been ported to both PyTorch and TensorFlow?
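
As a quick sanity check, you can confirm the two starting points really are different pretrained weights before blaming the frameworks themselves (rough sketch, assuming torchvision >= 0.13 for the weights enum):

import numpy as np
import torchvision.models as tvm
import tensorflow as tf

# First conv kernel from torchvision's ImageNet VGG16 (shape: out, in, kH, kW)
pt_w = tvm.vgg16(weights=tvm.VGG16_Weights.IMAGENET1K_V1).features[0].weight.detach().numpy()

# First conv kernel from Keras' ImageNet VGG16 (shape: kH, kW, in, out)
tf_w = tf.keras.applications.VGG16(weights="imagenet").get_layer("block1_conv1").get_weights()[0]

# Compare in a common layout; a large difference means different pretrained weights
print(np.abs(np.transpose(pt_w, (2, 3, 1, 0)) - tf_w).max())

If the difference is large, the two runs aren't starting from the same place (the Keras weights also expect Caffe-style BGR preprocessing, while torchvision's expect RGB with ImageNet normalization), and one fix is to convert a single checkpoint layer by layer, transposing conv kernels from OIHW to HWIO.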


TiredOldCrow t1_it7kkg2 wrote

I'm sympathetic to the complaint, but I don't know if you've linked a good example here.

I'll go to bat for Jason Brownlee. He's provided some really excellent hands-on tutorials over the years, repeatedly updates his blogs based on feedback, and overall has made the field much more accessible.


TiredOldCrow OP t1_isvei0e wrote

My own thinking is that large venues might implement automated filtering using detection models. Detection accuracy for machine-generated text increases with sequence length, so papers containing large amounts of generated text stand reasonable odds of being detected.

That said, the results of this detection would likely need to be scrutinized by a reviewer anyway (especially if conferences don't ban AI writing assistants altogether).
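
For what it's worth, the screening pass I'm imagining looks something like this (the detector checkpoint name and its label set are placeholders, not a recommendation of a specific tool):

from transformers import pipeline

# Hypothetical detector checkpoint; any text-classification model trained to
# distinguish human- from machine-written text would slot in here.
detector = pipeline("text-classification", model="some-org/generated-text-detector")

def screen_submission(paragraphs, threshold=0.9):
    """Flag paragraphs scored as likely machine-generated.

    Longer spans give the detector more signal, so scoring whole paragraphs
    or sections tends to work better than scoring single sentences.
    """
    flagged = []
    for text in paragraphs:
        result = detector(text)[0]  # e.g. {"label": "GENERATED", "score": 0.97}
        if result["label"] == "GENERATED" and result["score"] >= threshold:
            flagged.append((text, result["score"]))
    return flagged  # still needs human review before any action is taken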
