LeN3rd t1_jec6pvd wrote on March 30, 2023 at 11:07 PM

Reply to China calls US debt trap accusation 'irresponsible' by BubsyFanboy

FFS. Whenever a Chinese official peddles this kind of shit, tariffs should go up by 1%, a randomly chosen president from the West should visit Taiwan, and Disney should add it, with country borders, to a map in the background of its next movie.

LeN3rd t1_jebn0kt wrote on March 30, 2023 at 8:53 PM

Reply to comment by sakmaidic in The starfish I found! by bubblebuttbella

Are you ok?

LeN3rd t1_je2b1yf wrote on March 28, 2023 at 10:03 PM

Reply to comment by pewpewbrrrrrrt in How does an ideal vacuum have a dielectric breakdown voltage of 10^12 MV/m? If there is nothing there, then how can electricity pass through it? by skovalen

Ideal vacuum means there are no atoms/particles. Fields by definition are always there in all of space.

LeN3rd t1_jdls5jy wrote on March 25, 2023 at 10:06 AM

Reply to [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700

How big do models need to be until certain capabilities emerge? That is the actual question here, isn't it? Do smaller models perform as well in all tasks, or just the one they are trained for?

LeN3rd t1_jdhecq5 wrote on March 24, 2023 at 12:36 PM

Reply to comment by JimiSlew3 in [D] Simple Questions Thread by AutoModerator

I don't think so. OpenAI has overtaken any research done on LLMs by a long shot.

LeN3rd t1_jdhe9qb wrote on March 24, 2023 at 12:35 PM

Reply to comment by dotnethero in [D] Simple Questions Thread by AutoModerator

What language/suite are you using? You can take a look at the profilers for your language. I know TensorFlow has some profiling tools, and you can look at which operations are running on which device. Torch probably has some as well. If it's more esoteric, just use general language profilers and take a look at what your code is doing most of the time.
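For the "general language profiler" route in Python, here is a minimal sketch using the standard-library cProfile and pstats modules. The functions `slow_step`, `fast_step` and `pipeline` are made-up stand-ins for whatever your code actually does.

```python
import cProfile
import io
import pstats


def slow_step(n):
    # Hypothetical hot spot: a pure-Python loop.
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_step(n):
    # Hypothetical cheap step.
    return n * 2


def pipeline():
    return slow_step(200_000) + fast_step(10)


# Profile one run of the pipeline.
profiler = cProfile.Profile()
profiler.enable()
result = pipeline()
profiler.disable()

# Sort by cumulative time and capture the report as a string.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer).sort_stats("cumulative")
stats.print_stats(5)
report = buffer.getvalue()
print(report)
```

The function with the largest cumulative time is where to look first. This only sees Python-level calls, so for device placement you would still want the framework's own profiling tools.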

LeN3rd t1_jdhe09x wrote on March 24, 2023 at 12:33 PM

Reply to comment by kross00 in [D] Simple Questions Thread by AutoModerator

From what I have heard, it should be possible, but only with the 7B model. Unless you own a few A100s/H100s.

LeN3rd t1_jct6arv wrote on March 19, 2023 at 11:14 AM

Reply to comment by DreamMidnight in [D] Simple Questions Thread by AutoModerator

Ok, so all of these are linear (logistic) regression models, for which it makes sense to have more data points, because the weights aren't as constrained as in, e.g., a convolutional layer. But it is still a rule of thumb, not exactly a proof.

LeN3rd t1_jcitswg wrote on March 17, 2023 at 3:22 AM

Reply to comment by Batteredcode in [D] Simple Questions Thread by AutoModerator

The problem with your VAE idea is that you cannot apply the usual loss function, the difference between the input and the output, and thus a lot of nice theoretical constraints go out of the window afaik.

https://jaan.io/what-is-variational-autoencoder-vae-tutorial/

I would start with a cycleGAN:

https://machinelearningmastery.com/what-is-cyclegan/

It's a little older, but I personally know it a bit better than diffusion methods.

With the free-to-use Stable Diffusion model you could conditionally inpaint your image, though you would have to describe what is on that image in text. You could also train your own diffusion model, though you need a lot of training time. Not necessarily more than a GAN, but still.

It works by adding noise to an image and then denoising it again, step by step. For inpainting you do that only for the regions you want to inpaint (your R and G channels); for the regions that should stay the same as your original image, you just take the noised version of the original, which you already know.
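A rough NumPy sketch of just the mask-and-blend step described above, not a working diffusion model. `toy_denoiser` is a placeholder for a trained network, and the noise schedule `sigmas` and the 8x8 random "image" are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)


def add_noise(image, sigma):
    # Forward process: the original plus Gaussian noise of scale sigma.
    return image + sigma * rng.standard_normal(image.shape)


def toy_denoiser(x, sigma):
    # Placeholder for a trained network: shrinks pixels towards the channel mean.
    # A real model would predict the clean image (or the noise) here.
    mean = x.mean(axis=(0, 1), keepdims=True)
    return mean + (x - mean) / (1.0 + sigma)


def inpaint(image, known_mask, sigmas):
    # Start from pure noise at the highest noise level.
    x = sigmas[0] * rng.standard_normal(image.shape)
    for sigma, next_sigma in zip(sigmas[:-1], sigmas[1:]):
        generated = toy_denoiser(x, sigma)       # model output for all pixels
        known = add_noise(image, next_sigma)     # original, noised to the next level
        # Keep the noised original where pixels are known, the model output elsewhere.
        x = known_mask * known + (1.0 - known_mask) * generated
    return x


image = rng.random((8, 8, 3))
known_mask = np.zeros_like(image)
known_mask[..., 2] = 1.0                         # only the blue channel is known
sigmas = np.linspace(1.0, 0.0, 11)               # noise level decreases to zero
result = inpaint(image, known_mask, sigmas)
```

Because the last noise level is zero, the known blue channel comes out identical to the input, while R and G are whatever the denoiser produced.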

LeN3rd t1_jcislrk wrote on March 17, 2023 at 3:11 AM

Reply to comment by DreamMidnight in [D] Simple Questions Thread by AutoModerator

I have not heard this before. Where is it from? I know that you should have more datapoints than parameters in classical models.

LeN3rd t1_jchkhuv wrote on March 16, 2023 at 9:47 PM

Reply to comment by No_Complaint_1304 in [D] Simple Questions Thread by AutoModerator

What you can try is to start with linear or logistic regression and try to learn on Wikipedia. That might be fun and give you decent results.

LeN3rd t1_jchht71 wrote on March 16, 2023 at 9:29 PM

Reply to comment by ilrazziatore in [D] Simple Questions Thread by AutoModerator

Then I would just use a completely different test dataset. In a paper I would also expect this.

LeN3rd t1_jcgzk3c wrote on March 16, 2023 at 7:30 PM

Reply to comment by ilrazziatore in [D] Simple Questions Thread by AutoModerator

If it is model uncertainty, the BNN should assume distributions only for the model parameters, no? If you make the samples a distribution, you assume data uncertainty. Also, I do not know exactly what your other model gives you, but as long as you get variances, I would just compare those at first. If the models give vastly different means, you should take that into account. There is probably some nice way to combine this ensemble uncertainty with the uncertainty of the individual models. Also, this strongly suggests that one model is biased and does not give you a correct estimate of the model uncertainty.

LeN3rd t1_jcguoiy wrote on March 16, 2023 at 6:59 PM

Reply to comment by No_Complaint_1304 in [D] Simple Questions Thread by AutoModerator

I didn't mean to discourage you. It's a fascinating field, but it is its own field of research for a reason. Start with BERT and see where that gets you.

These ones are also a nice small watch:

https://www.youtube.com/watch?v=gQddtTdmG_8

https://www.youtube.com/watch?v=rURRYI66E54

LeN3rd t1_jcgu1z5 wrote on March 16, 2023 at 6:56 PM

Reply to comment by Batteredcode in [D] Simple Questions Thread by AutoModerator

This is possible in multiple ways. Old methods for this would be to view this as an inverse problem and apply some optimization method to it, like ADMM or FISTA.

If lots of data is missing (in your case the complete R and G channels), you should use a neural network for this. You are on the right track, though it could get hairy. If you have a prior (you have a dataset and you want it to work on similar images), a (cycle) GAN or a retrained Stable Diffusion model could work.

I am unsure about VAEs for your problem, since you usually train them by having the same input and output. You shouldn't enforce the latent to be only the blue channel, since then the encoder is useless. Training only the decoder side is essentially what GANs and diffusion networks do, so I would start there.

LeN3rd t1_jcgsjxq wrote on March 16, 2023 at 6:46 PM

Reply to comment by ilrazziatore in [D] Simple Questions Thread by AutoModerator

Define probabilistic. Is it model uncertainty or data uncertainty? Either way, you should get a standard deviation from your model (either as an output parameter or implicitly via ensembles) that you can compare.

LeN3rd t1_jcgs51v wrote on March 16, 2023 at 6:44 PM

Reply to comment by wikipedia_answer_bot in [D] Simple Questions Thread by AutoModerator

don't hurt me

LeN3rd t1_jcgrxfp wrote on March 16, 2023 at 6:42 PM

Reply to comment by denxiaopin in [D] Simple Questions Thread by AutoModerator

Strongly depends on your constraints. There are ways to get 3D geometry from a photo/video. If you have the geometry of your glasses, you should be able to see if they fit, though you might have some problems with actually adjusting the glasses to fit the face geometry. But you could also just do what your optician does and take a frontal photo of your face in a controlled environment.

LeN3rd t1_jcgrhlm wrote on March 16, 2023 at 6:40 PM

Reply to comment by towsif110 in [D] Simple Questions Thread by AutoModerator

Please be a little more coherent in your question. No one has any idea about your specific setup unless you tell us what you want to achieve. E.g. in the AI community RF usually reads as random forest (or as a typo for RL, reinforcement learning), not radio frequency. If you want to classify data streams coming from drones, take a look at pattern matching and nearest neighbour methods before you start to train a large neural network.

LeN3rd t1_jcgqzvo wrote on March 16, 2023 at 6:37 PM

Reply to comment by DreamMidnight in [D] Simple Questions Thread by AutoModerator

If you have more variables than datapoints, you will run into problems once your model starts learning the data by heart. Your model overfits to the training data: https://en.wikipedia.org/wiki/Overfitting

You can either reduce the number of parameters in your model, or apply a prior (a constraint on your model parameters) to improve test dataset performance.

Since neural networks (the standard empirical machine learning tools nowadays) have a structure for their parameters, they can have many more parameters than simple linear regression models, but they seem to run into problems when the number of parameters in the network matches the number of datapoints. This has only been shown empirically; I do not know of any mathematical proofs for it.
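A small NumPy sketch of the linear case, on made-up data: 30 variables but only 10 training points, fitted once by plain least squares and once with a ridge penalty (a Gaussian prior on the weights). The data generator, the sizes and the penalty strength `lam` are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_train, n_test, n_features = 10, 200, 30    # more variables than training points
true_w = np.zeros(n_features)
true_w[:3] = [2.0, -1.0, 0.5]                # only three variables actually matter


def make_data(n):
    X = rng.standard_normal((n, n_features))
    y = X @ true_w + 0.1 * rng.standard_normal(n)
    return X, y


def mse(X, y, w):
    return float(np.mean((X @ w - y) ** 2))


X_train, y_train = make_data(n_train)
X_test, y_test = make_data(n_test)

# Plain least squares: with more variables than points it can fit the training set exactly.
w_ols, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# Ridge regression: the penalty lam acts as a prior that keeps the weights small.
lam = 1.0
A = X_train.T @ X_train + lam * np.eye(n_features)
w_ridge = np.linalg.solve(A, X_train.T @ y_train)

print("OLS   train/test MSE:", mse(X_train, y_train, w_ols), mse(X_test, y_test, w_ols))
print("ridge train/test MSE:", mse(X_train, y_train, w_ridge), mse(X_test, y_test, w_ridge))
```

The unconstrained fit drives the training error to essentially zero while the test error stays large, which is the "learning by heart" case. How much the prior helps on the test set depends on `lam` and on the data.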

LeN3rd t1_jcgq97y wrote on March 16, 2023 at 6:32 PM

Reply to comment by No_Complaint_1304 in [D] Simple Questions Thread by AutoModerator

You will need more than a week. If you just want to predict the next word in a sentence, take a look at large language models, ChatGPT being one of them. BERT is a research alternative afaik. If you aim to learn the probabilities yourself, you will need at least a few months.

In general, what you want is a generative model that can sample from the conditional probability distribution. For sequences, transformers like BERT and ChatGPT are usually state of the art. You can also take a look at normalizing flows and diffusion models to learn probability distributions. But this needs some maths, and I unfortunately do not know what smaller models can be used for computational linguistics applications like this.

LeN3rd t1_jcgp44s wrote on March 16, 2023 at 6:25 PM

Reply to comment by Sonicxc in [D] Simple Questions Thread by AutoModerator

How big is your dataset? Before you start anything wild, I would look at kernel clustering methods, or even clustering without kernels. Just cluster your broken and non-broken images and calculate some distance (can be done with kernels if it needs to be nonlinear).

Also, nearest neighbor could work pretty well in your case. Just compare your new image to the closest one (according to some metric) in your two datasets and Bob's your uncle.

If you need a number, look at simple CNNs. You need more training data for this to work well, though.
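The nearest-neighbour comparison could be sketched like this in NumPy. The "images" here are random vectors made up for illustration (intact ones dark, broken ones bright), and plain Euclidean distance is just one possible metric.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up stand-ins for flattened 8x8 images of the two classes.
intact = rng.normal(0.2, 0.05, size=(20, 64))
broken = rng.normal(0.8, 0.05, size=(20, 64))


def nearest_label(query, intact, broken):
    # Euclidean distance from the query to the closest example in each set.
    d_intact = np.linalg.norm(intact - query, axis=1).min()
    d_broken = np.linalg.norm(broken - query, axis=1).min()
    return "broken" if d_broken < d_intact else "intact"


new_image = rng.normal(0.8, 0.05, size=64)   # hypothetical new sample
label = nearest_label(new_image, intact, broken)
print(label)
```

On real images, raw pixel distance may be too crude, so you would likely swap in a better metric or compare feature vectors instead.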

LeN3rd t1_jcgo5ro wrote on March 16, 2023 at 6:19 PM

Reply to comment by PhysZhongli in [D] Simple Questions Thread by AutoModerator

You should take a look at uncertainty in general. What you are trying to do is calculate epistemic uncertainty. (google epistemic vs aleatoric uncertainty).

One thing that works well is to have a dropout layer that is active during prediction!! (In TensorFlow you have to pass training=True into the call to activate it during prediction.) Sample like 100 times and calculate the standard deviation. This gives you a general "I do not know" function from the network. You can also do so by training 20 models and letting them output 20 different results. With this you can assign the 101 label when the uncertainty is too high.

In my experience you should stay away from Bayesian neural networks, since they are extremely hard to train and cannot model multimodal uncertainty. (Dropout can't either, but is WAAAAYYY easier to train.)
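To keep it self-contained, here is the dropout-at-prediction idea as a plain NumPy sketch rather than actual TensorFlow code. The weights `W1` and `W2` are random and untrained, and the dropout rate and `threshold` are arbitrary values chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical, untrained weights of a network with one hidden layer.
W1 = rng.standard_normal((4, 32))
W2 = rng.standard_normal((32, 1))


def predict_with_dropout(x, rate=0.5):
    hidden = np.maximum(x @ W1, 0.0)            # ReLU hidden layer
    keep = rng.random(hidden.shape) >= rate     # fresh random mask on every call
    hidden = hidden * keep / (1.0 - rate)       # inverted dropout scaling
    return (hidden @ W2).ravel()


def mc_dropout(x, n_samples=100):
    # Run the same inputs many times with dropout switched on.
    samples = np.stack([predict_with_dropout(x) for _ in range(n_samples)])
    return samples.mean(axis=0), samples.std(axis=0)


x = rng.standard_normal((5, 4))                 # five made-up inputs
mean, std = mc_dropout(x)

threshold = 1.0                                 # arbitrary cut-off
unknown = std > threshold                       # candidates for the "I do not know" label
print(mean, std, unknown)
```

In a real setup the threshold would be tuned on held-out data, and the ensemble variant simply replaces the repeated dropout passes with the outputs of the separately trained models.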

LeN3rd t1_jcgn73n wrote on March 16, 2023 at 6:13 PM

Reply to [D] Simple Questions Thread by AutoModerator

Can anyone recommend a good, maintained and well-organized MCMC Python package? Everything I found was either not maintained, had only a single research group behind it, or had too many bugs for me to continue with that project. I want TensorFlow/PyTorch, but for MCMC sampling, please.

LeN3rd t1_jbzaooe wrote on March 12, 2023 at 9:51 PM

Reply to comment by Ridley_Himself in As they still have a neutral charge, can antineutrons replace neutrons in a regular atom? by Oheligud

Wouldn't that leave single quarks? I thought that was a no-no.