• 0 Posts
  • 3 Comments
Joined 2 years ago
cake
Cake day: October 4th, 2023

help-circle

  • Of course, another option is for people to dramatically curb their use of social media, or at a minimum, regularly delete posts after a set time threshold.

    Deletion won’t deal with someone seriously-interested in harvesting stuff, because they can log it as it becomes available. And curbing use isn’t ideal.

    I mentioned before the possibility of poisoning data, like, sporadically adding some incorrect information about oneself into one’s comments. Ideally something that doesn’t impact the meaning of the comments, but would cause a computer to associate one with someone else.

    There are some other issues. My guess is that it’s probably possible to fingerprint someone to a substantial degree by the phrasing that they use. One mole in the counterintelligence portion of the FBI, Robert Hanssen, was found because on two occasions he used the unusual phrase “the purple-pissing Japanese”.

    FBI investigators later made progress during an operation where they paid disaffected Russian intelligence officers to deliver information on moles. They paid $7 million to KGB agent Aleksander Shcherbakov[48] who had access to a file on “B”. While it did not contain Hanssen’s name, among the information was an audiotape of a July 21, 1986, conversation between “B” and KGB agent Aleksander Fefelov.[49] FBI agent Michael Waguespack recognized the voice in the tape, but could not remember who it was from. Rifling through the rest of the files, they found notes of the mole using a quote from George S. Patton’s speech to the Third Army about “the purple-pissing Japanese”.[50] FBI analyst Bob King remembered Hanssen using that same quote. Waguespack listened to the tape again and recognized the voice as Hanssen’s. With the mole finally identified, locations, dates, and cases were matched with Hanssen’s activities during the period. Two fingerprints collected from a trash bag in the file were analyzed and proved to be Hanssen’s.[51][52][53]

    That might be defeated by passing text through something like an LLM to rewrite it. So, for example, to take a snippet of my above comment:

    Respond with the following text rephrased sentence by sentence, concisely written as a British computer scientist might write it:

    Deletion won’t deal with someone seriously-interested in harvesting stuff, because they can log it as it becomes available. And curbing use isn’t ideal.

    I mentioned before the possibility of poisoning data, like, sporadically adding some incorrect information about oneself into one’s comments. Ideally something that doesn’t impact the meaning of the comments, but would cause a computer to associate one with someone else.

    I get:

    The deletion of data alone will not prevent a determined party from gathering information, as they may simply record the information as it becomes available prior to its deletion. Moreover, restricting usage is not an ideal solution to the problem at hand.

    I previously mentioned the possibility of introducing deliberate errors or misinformation into one’s own data, such as periodically inserting inaccurate details about oneself within comments. The goal would be to include information that does not significantly alter the meaning of the comment, but which would cause automated systems to incorrectly associate that individual with another person.

    That might work. One would have to check the comment to make sure that it doesn’t mangle the thing to the point that it is incorrect, but it might defeat profiling based on phrasing peculiarities of a given person, especially if many users used a similar “profile” for comment re-writing.

    A second problem is that one’s interests are probably something of a fingerprint. It might be possible to use separate accounts related to separate interests — for example, instead of having one account, having an account per community or similar. That does undermine the ability to use reputation generated elsewhere (“Oh, user X has been providing helpful information for five years over in community X, so they’re likely to also be doing so in community Y”), which kind of degrades online communities, but it’s better than just dropping pseudonymity and going 4chan-style fully anonymous and completely losing reputation.


  • Meta’s chief AI scientist and Turing Award winner Yann LeCun plans to leave the company to launch his own startup focused on a different type of AI called “world models,” the Financial Times reported.

    World models are hypothetical AI systems that some AI engineers expect to develop an internal “understanding” of the physical world by learning from video and spatial data rather than text alone.

    Sounds reasonable.

    That being said, I am willing to believe that an LLM could be part of an AGI. It might well be an efficient way to incorporate a lot of knowledge about the world. Wikipedia helps provide me with a lot of knowledge, for example, though I don’t have a direct brain link to it. It’s just that I don’t expect an AGI to be an LLM.

    EDIT: Also, IIRC from past reading, Meta has separate groups aimed at near-term commercial products (and I can very much believe that there might be plenty of room for LLMs here) and aimed advanced AI. It’s not clear to me from the article whether he just wants more focus on advanced AI or whether he disagrees with an LLM focus in their afvanced AI group.

    I do think that if you’re a company building a lot of parallel compute capacity now, that to make a return on that, you need to take advantage of existing or quite near-future stuff, even if it’s not AGI. Doesn’t make sense to build a lot of compute capacity, then spend fifteen years banging on research before you have something to utilize that capacity.

    https://datacentremagazine.com/news/why-is-meta-investing-600bn-in-ai-data-centres

    Meta reveals US$600bn plan to build AI data centres, expand energy projects and fund local programmes through 2028

    So Meta probably cannot only be doing AGI work.