• 6 Posts
  • 60 Comments
Joined 11 months ago
Cake day: August 19th, 2023




  • Claude Opus disagrees, lol (edit to add: all of what follows is Claude; not me):

    I respectfully disagree with the claim that current language models are logically incapable of mapping language to actual concepts. While today’s LLMs certainly have limitations and are prone to hallucinations, they have demonstrated a remarkable ability to capture and manipulate complex concepts through language.

    A few key points:

    LLMs are trained on vast amounts of human-generated text which is grounded in real-world concepts, allowing the models to build rich representations that go beyond surface-level word associations. Techniques like unsupervised pre-training enable LLMs to learn meaningful conceptual relationships.
    
    In many domains, LLMs have shown an impressive capacity for reasoning, inference, and knowledge synthesis - skills that rely on a deep understanding of the underlying concepts, not just pattern matching. Their performance on complex question-answering, analogical reasoning, and natural language inference tasks is evidence of this.
    
    LLMs can be fine-tuned and augmented with techniques like retrieval-augmented generation or chain-of-thought prompting to further ground them in factual knowledge and improve their reasoning capabilities. This allows more robust conceptual mappings.
    
    Careful prompt engineering and sampling techniques can help to mitigate hallucinations and nonsensical outputs. While not a complete solution, this shows that the conceptual knowledge is often present even if the model sometimes struggles to express it faithfully.
    

    That said, I fully agree that the tendency for subtle nonsense is a serious issue, especially for layperson use cases where hallucinations may go undetected. Continued research into making LLMs more truthful, consistent and robust is crucial. Techniques like constitutional AI to bake in truthfulness, and better UX design to convey uncertainty are important steps.

    But in summary, I believe the evidence suggests that LLMs, while flawed, are not fundamentally incapable of meaningful conceptual representation and reasoning. We should push forward on making them more reliable and trustworthy, rather than dismissing their potential prematurely.









  • Would you, after devoting full years of your adult life to the unpaid work of learning the requisite advanced math and computer science needed to develop such a model, like to spend years more of your life to develop a generative AI model without compensation? Within the US, it is legal to use public text for commercial purposes without any need to obtain a permit. Developers of such models deserve to be paid, just like any other workers, and that doesn’t happen unless either we make AI a utility (or something similar) and funnel tax dollars into it or the company charges for the product so it can pay its employees.

    I wholeheartedly agree that AI shouldn’t be trained on copyrighted, private, or any other works outside of the public domain. I think that OpenAI’s use of nonpublic material was illegal and unethical, and that they should be legally obligated to scrap their entire model and train another one from legal material. But developers deserve to be paid for their labor and time, and that requires the company that employs them to make money somehow.



  • I’m so in the minority here, but I have a different perspective.

    I worked at a grocery store for years, with about a third of my job being cart duty. I loved it when people left their carts outside of the corrals, for a few reasons.

    First, if a lot of people did so, I would point it out to whichever manager was on duty before I went outside. My manager then knew that I would take longer, which gave me more time to stroll, relax, and enjoy the outdoors before coming back in to customer craziness. Having those extra minutes because my manager didn’t know how long I should take was nice.

    Second, sometimes I had to walk way the hell out to the edge of the parking lot, which made for a long walk away from customer craziness. Such walks were especially pleasant when the weather was good.

    Third, it was job security. During the recession, my managers wanted to let as many people go as they could, but customers who made it so that even the most efficient cart duty workers took a while to clear the lot effectively kept more of us employed than management would have kept otherwise.

    For those reasons, whenever the weather is nice, I try to leave my cart in a weird spot that is anchored by something. I realize that many other cart duty folks probably dislike me for it, but I know I appreciated it when others did this. So I do it for the folks like me.

    I know all of the arguments against it and I’m not trying to debate here. Just sharing a different perspective; sometimes, leaving your cart in a terrible spot can be nice for some of the workers.





  • The RCT is free to access (if you haven’t downloaded more than three NBER papers; if you have, open the page in a different browser). Scroll down on the page I linked and download it via the button.

    Statistically, you can control for variables in OLS regression–that’s literally exactly what the model does when you include more than one variable–and, provided that you got your doctorate in anything that uses statistics, I am sure you know that.
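    To illustrate what "controlling for" a variable means here, a minimal sketch on made-up numbers (not data from either study). The names `wfh`, `flu_season`, and `productivity` are hypothetical, and the data are constructed so that the true WFH effect is -0.1 and the flu-season effect is -0.2. Because WFH months overlap with flu season in this toy data, regressing on `wfh` alone overstates the WFH effect; adding `flu_season` as a second regressor recovers it.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(columns, y):
    """Return OLS coefficients (intercept first) from the normal equations."""
    rows = [[1.0] + list(values) for values in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical monthly data: WFH starts in month 4 and overlaps flu season.
wfh = [0, 0, 0, 0, 1, 1, 1, 1]
flu_season = [0, 0, 0, 1, 1, 1, 1, 0]
# Constructed outcome: 1.4 - 0.1 * wfh - 0.2 * flu_season (no noise).
productivity = [1.4 - 0.1 * w - 0.2 * f for w, f in zip(wfh, flu_season)]

naive = ols([wfh], productivity)                   # WFH only
controlled = ols([wfh, flu_season], productivity)  # WFH plus seasonal control

print("WFH coefficient, no control:  ", round(naive[1], 3))
print("WFH coefficient, with control:", round(controlled[1], 3))
```

    In this toy setup the uncontrolled coefficient comes out near -0.2 and the controlled one near -0.1, because the second regressor absorbs the part of the drop that is really seasonal.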

    Seasonality is one of the more basic economics concepts. The influence of weather and seasonal illness trends on productivity has been shown in a number of studies (e.g., productivity declines during the flu season). The authors didn’t “show” it because it would be like showing gravity in a physics paper. Some things can be assumed. Also, productivity didn’t have a trend, as was stated in the text that I quoted.

    You completely ignored the log-transformed results, which the authors note were better fit by the regression than the untransformed data, and which showed lower productivity when working from home regardless of whether seasonality was controlled for.

    Personally, I think people should be able to work from home all they want. Productivity isn’t the only important thing in life, nor is it the only important thing to businesses (e.g., retention of top employees is important). I am wholly against WebMD and all other companies requiring employees to return to the office. All I was doing in my comments was trying to clarify the data on WFH and productivity. There are good reasons to continue to allow WFH, but increased productivity is not one.

    I’m going to finish my course prep. You can have the final word here; I don’t have time to continue debating.


  • First, the RCT is a much stronger study. I’m not sure why you’re picking a fight with a correlational paper when there is a causal manipulation that I linked first.

    Second, did you actually read the paper? 1B isn’t the graph of productivity; 1C is. You can’t just look at a graph, either–you need statistics.

    "For Output, figure 1B, there is no visible monotonic or linear trend, so a seasonal time correction might be more appropriate here. Moreover, average output appears to be slightly lower during WFH.

    For Productivity, figure 1C, the graph is more volatile, which is not surprising for a ratio. There is no clear linear time trend before WFH, but some variation from month to month, so a seasonal correction might be more appropriate. Productivity drops visibly during WFH. Finally, figure 1D plots the log of Productivity, which drops considerably after the start of WFH.

    To quantify the WFH effect, and to control for employee and team time-invariant variables (via employee and team fixed effects), we now turn to the regression analyses. Informally, the estimates give us average differences in outcomes before and during WFH for the same employee, controlling for team effects (since employees sometimes switch teams) and time trends.

    Table 4 reports WFH effect estimates based on OLS regressions for all three outcome variables, plus the natural logarithm of Productivity, in each case with linear and seasonal time trend corrections. All estimates are in line with the visible effects in the raw data in figure 1.

    Columns 5 and 6 show that both WFH effect estimates on Productivity are negative, but only the estimate with seasonal time trend is significantly different from zero. We prefer that specification, since both the plot and the linear time trend coefficient indicate that a linear trend is not as appropriate. According to this specification, productivity decreased by 0.26 output percentage points per hour worked. Given an average WFO productivity of 1.36, this estimate corresponds to a 19% drop in output per hour worked. This is economically significant: if employees worked a fixed 40 hours per week, this would imply a drop in output of 10.2 output percentage points in a week. In other words, if employees had not increased time worked during WFH, on average they would have completed only 90 of 100 assigned tasks.

    Columns 7 and 8 explain the log of Productivity, which strongly increases the fit of the regression. The WFH effect is negative and significantly different from zero at all significance levels, irrespective of time controls."
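    As a quick check of the arithmetic in the quoted passage, a small sketch using only the rounded figures from the quote (0.26, 1.36, and a 40-hour week). The variable names are mine. One caveat: with the rounded 0.26, the weekly figure works out to about 10.4, not the quoted 10.2; my assumption, which the excerpt does not confirm, is that the authors used an unrounded coefficient closer to 0.255.

```python
# Figures taken from the quoted passage (rounded as quoted).
wfh_effect = 0.26        # drop in output percentage points per hour worked
wfo_productivity = 1.36  # average work-from-office productivity
hours_per_week = 40      # the fixed work week assumed in the quote

# Relative drop in output per hour worked.
relative_drop = wfh_effect / wfo_productivity

# Implied weekly drop if hours had stayed fixed at 40.
weekly_drop = wfh_effect * hours_per_week

print(f"relative drop: {relative_drop:.1%}")
print(f"weekly drop: {weekly_drop:.1f} output percentage points")
```

    The relative drop rounds to the 19% stated in the quote, and either weekly figure is consistent with completing roughly 90 of 100 assigned tasks.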