Finish seminar 1 notes

This commit is contained in:
1jamesthompson1
2025-08-18 11:11:24 +12:00
parent 574ff09929
commit da7e9dd6ef
12 changed files with 64 additions and 1 deletion


@@ -0,0 +1,20 @@
## Athletic performance
The study used more classical machine learning techniques on data collected from athletes in women's basketball to predict various performance indicators.
I agree with the limitations they posed; I think the most limiting factor of this study is how hand-tuned and manual the learning process was.
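As a purely illustrative guess at what that manual pipeline might look like (the feature names, the one-hot expansion from 8 raw features to 20 model inputs, and the random-forest regressor are all my assumptions, not the paper's), a sketch in scikit-learn:

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

rng = np.random.default_rng(0)
n = 120
# Invented stand-ins for the 8 raw features (6 numeric + 2 categorical).
df = pd.DataFrame({
    "minutes": rng.uniform(5, 40, n),
    "shots": rng.integers(0, 25, n),
    "rebounds": rng.integers(0, 15, n),
    "assists": rng.integers(0, 12, n),
    "turnovers": rng.integers(0, 8, n),
    "height_cm": rng.normal(180, 8, n),
    "position": rng.choice(["guard", "forward", "center"], n),
    "division": rng.choice([f"div{i}" for i in range(11)], n),
})
y = rng.normal(10, 3, n)  # fake performance indicator, e.g. points per game

# 6 scaled numeric columns + (3 + 11) one-hot columns = 20 model inputs,
# matching the 8-features-expanded-to-20-values question below.
prep = ColumnTransformer([
    ("num", StandardScaler(), ["minutes", "shots", "rebounds",
                               "assists", "turnovers", "height_cm"]),
    ("cat", OneHotEncoder(), ["position", "division"]),
])
model = make_pipeline(prep, RandomForestRegressor(random_state=0))
model.fit(df, y)
print(model.predict(df.head(3)))
```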
Some questions I would have asked:
- Did the authors look into using a more flexible neural-network-style architecture? Specifically, might it be possible to train a base model that could then be fine-tuned to different sports/divisions?
- Am I correct in understanding that the model took only 8 features, which were expanded into 20 values fed into the model?
- Lastly, were any teams given access to this data and its predictions, to see what coaches thought about its usefulness?
## Conversational diagnostics
This study developed and explored a special (I assume just fine-tuning) process to make an LLM that is useful in "replacing" a doctor. It is interesting to hear that it did so well on many of the metrics they used; it matches my intuition that a model could do better than a human at diagnosis. However, as pointed out in the summary, this model would still be a long way from being 'deployed' due to various engineering and regulatory problems.
Some questions I have are:
- I assume this model was fine-tuned from a base LLM. What was the base LLM, and how much self-play did it have? If it was trained from scratch, how large was the model? (Though it seems the training data and process are somewhat secret.)
- Was the experiment done blind? If so, were patients asked to predict whether they spoke with an AI or a human?


@@ -10,5 +10,10 @@ Where there are two seminar sets week 5-6 and week 9-10. I only need to attend t
## Set 1
### Send to Andrew to get ticked off
- [ ] Mon 4th August (athletic performance and conversational diagnostics)
- [ ] Tue th August (poverty estimates and my seminar)
### Sent to Bach to get ticked off

Binary file not shown.

Binary file not shown.


@@ -0,0 +1,17 @@
## Legal prediction
This paper outlines an attempt to build a system that can provide judgements on legal cases in case-law systems. It works by using BERT for semantic search, then combining the current case embedding with the retrieved evidence; this combination is weighted and used to predict whether or not there was a violation. LLMs were used to summarise the articles into shorter text, which seems like a weak spot in the model, as nuance in a case can be lost.
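A minimal sketch of how I understand that pipeline; the encoder stub, the softmax weighting, and the linear classification head are my assumptions about the unstated details, not the paper's actual design:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in for a BERT sentence encoder; returns a unit vector."""
    rng = np.random.default_rng(sum(map(ord, text)))
    v = rng.standard_normal(384)
    return v / np.linalg.norm(v)

def predict_violation(case_text, evidence_texts, w, b=0.0, top_k=3):
    """Semantic search: rank evidence by cosine similarity to the case,
    keep the top-k, then feed a similarity-weighted combination of case
    and evidence embeddings into a linear classifier."""
    case_vec = embed(case_text)
    ev_vecs = np.stack([embed(t) for t in evidence_texts])
    sims = ev_vecs @ case_vec                   # cosine (unit vectors)
    top = np.argsort(sims)[-top_k:]
    weights = np.exp(sims[top]) / np.exp(sims[top]).sum()
    combined = np.concatenate([case_vec, weights @ ev_vecs[top]])
    score = combined @ w + b                    # learned "violated?" head
    return 1.0 / (1.0 + np.exp(-score))         # probability of a violation

# Toy usage with random classifier weights and LLM-summarised articles.
w = np.random.default_rng(1).standard_normal(2 * 384)
p = predict_violation(
    "facts of the current case ...",
    ["summary of article 6 ...", "summary of article 8 ...",
     "precedent case A ...", "precedent case B ..."],
    w)
print(f"P(violation) = {p:.2f}")
```

The `top_k` retrieval depth here is the value my second question below asks about.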
The questions I would have are:
- What accuracy and F1 score were achieved?
- What was the best top-k value found?
- Were any tests conducted on this model using cases from a different court system?
## Weather
_No summary, so this response is written solely from the slides._
Medium-range forecasting is forecasting in the one-to-two-week range. Prediction gets much harder past the one-week mark, where traditional methods using physics models and the like break down. Graph Neural Network based methods have performance that is on par with, and even exceeds, traditional methods. The deep learning revolution has made its way to weather forecasting!
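To make the "GNN-based" part concrete, here is one toy message-passing step of the kind these MLWP models run; the shapes, update rule, and ring-shaped mesh are illustrative assumptions, not any real model's architecture:

```python
import numpy as np

def message_passing_step(node_feats, edges, W_msg, W_upd):
    """One GNN step over a mesh: each node aggregates transformed features
    from its in-neighbours, then updates its own state."""
    agg = np.zeros_like(node_feats)
    for src, dst in edges:                    # directed mesh edges
        agg[dst] += node_feats[src] @ W_msg   # message from a neighbour
    return np.tanh(node_feats @ W_upd + agg)  # toy nonlinear update

# Toy "globe": 4 mesh nodes in a ring, an 8-dim weather state per node.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (1, 0), (2, 1), (3, 2), (0, 3)]
W_msg, W_upd = rng.standard_normal((8, 8)), rng.standard_normal((8, 8))
x_next = message_passing_step(x, edges, W_msg, W_upd)
print(x_next.shape)  # (4, 8): an updated state for every node at once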
Some questions I would have:
- Do the MLWP models always require whole-earth input and output?
- Are these NN-based systems deployed anywhere in the world?

Binary file not shown.


@@ -6,4 +6,9 @@ I think in principle the idea of using earth observation and readily available i
Some questions that I have are:
- I assume that they used some other "off the shelf" style models or datasets to get the earth observation data. However, did they consider/investigate the idea of using satellite imagery directly with neural networks?
- With regard to the underperformance in urban areas, was it due to the data not being dense enough there (i.e. trying to predict 10,000 people from the same-sized set of earth observations, as opposed to only 100 people in rural areas), or was it more that urban areas have different relationships with assets and spending, making the method less applicable?
## Carbon emissions
Not completed, as it was our group presenting.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.


@@ -0,0 +1,16 @@
## AI headlines
This paper talks about two studies that were conducted to test the effect of labels on people's perceptions of news headlines. It found, with seemingly strong evidence, that having an AI label on a news headline reduced people's trust in the article and willingness to share it, regardless of what the headline itself was. I think the observation that people incorrectly assume full AI involvement because of the label, and have lower trust in AI, is something that will change rapidly over the next few years. As AI gets stronger, articles will be fully generated autonomously and could be stronger and more truthful than their human-made counterparts.
Questions I would have liked to ask:
- For the AI news headlines, did they just take the first headline generated, or choose from the top 3 or so? (I imagine that in practice choosing from the top three options is more realistic.)
- Did any part of the study test people's predictions of whether a headline was AI-generated? The results suggest people either can't tell or have no opinion, but explicit testing would be interesting.
- I imagine different political groups have different views of AI; was there a noticeable difference between groups in how they felt about AI?
## Turing test
This study conducts a text-based Turing test. They run two tests: one using remote workers from Prolific and another using a smaller set of US undergraduate students. In both sets of experiments the AI performed at a similar level to humans, with roughly a two-thirds detection rate. Assuming this study is rigorous enough, I would consider this sufficient evidence that the simple text-based Turing test (i.e. sit down and have a shortish conversation) is saturated and no longer a useful test. I think next steps have to involve significant conversation time (i.e. weeks to months) and more modes, like voice.
Some questions that I would like to ask are:
- Were there any tests using the same AI systems to predict whether they were speaking with a human or an AI?
- What caused the conversation to end, i.e. was it a hard timer or a soft timer?
- Did the TIKTOK prompt (and others) specifically instruct the model to be deceptive and trick the human?