“The Welfare of Digital Minds: A Research Agenda” by Derek Shiller, Bob Fischer, Hayley Clatterbuck, arvomm, David_Moss
EA Forum Podcast (All audio) - A podcast by EA Forum Team
Rethink Priorities' Worldview Investigations Team is sharing research agendas that provide overviews of some key research areas, as well as projects in those areas that we would be keen to pursue. This is the second of three such agendas to be posted in November 2024. It describes our view of the digital welfare research landscape and the kinds of questions that we're interested in taking up. Our team is currently working on a model of the probabilities of consciousness in contemporary AI systems that touches on a variety of the questions posed here. We plan to release research related to the issues raised by that project as we go. We're interested in other projects related to the questions below, and we have some specific ideas about what we'd like to work on, but this is a fast-moving space, so we would recommend that prospective donors reach out [...]

---

Outline:
(01:02) Introduction
(06:03) 1. Questions of Systems
  (08:06) Significant system divisions
    (08:16) Deep Learning Models vs Other AI Paradigms
    (09:33) Transformers vs Other Deep Learning Models
    (10:35) Training Phases vs Deployment Phases
    (11:57) Scaffolded vs Pure Models
    (13:07) General vs Task-Specific Models
    (14:29) Embodied, Agentic, and Oracle Models
    (15:59) Digital vs Neuromorphic Implementation
  (16:31) Sample questions
(17:37) 2. Questions of Capabilities and Mechanisms
  (20:12) Sample questions
    (20:19) Do LLMs have access to their own internal processes? Can they tell what is going on inside of themselves?
    (21:09) Do LLMs or other deep learning models have goals? Are their actions influenced by representations with a world-to-mind direction of fit?
    (21:50) For models with goals, do they have decision-making processes? Do they weigh different considerations for alternative options and contrast them with each other?
    (22:18) To what extent do AIs (LLMs or other deep learning models) use internal models for generating predictions? Are there many such models within a single system, and if so, do they have a heterogeneous structure? What aspects of training encourage the formation of such models?
    (23:13) What are the limitations on deep learning model creativity?
    (23:57) Do LLMs have special representations for the true relationships between concepts, as opposed to false but widely believed relationships?
    (24:56) How do LLM or other deep learning model representations map onto the conceptual/non-conceptual distinction from the philosophy of mind?
    (25:36) To what extent can LLMs reason about possibilities in order to choose text continuations? Are the limits of that reasoning constrained to the structure of outputted text?
(26:54) 3. Questions of Cognitive Assessment
  (28:14) Important traits
  (29:36) Sample questions
    (29:46) Is the case for AI possessing this trait primarily dependent on computational functionalism?
    (30:22) How plausible is computational functionalism for this trait, as opposed to other nearby theories (e.g., non-computationalist functionalism), or other kinds of metaphysical theories altogether?
    (31:08) What theories of this trait are taken seriously by experts? Which are most plausible?
    (32:10) How well is the space of viable theories of this trait covered by existing proposals? If we just evaluate systems according to the specifics of existing proposals, are we likely to be missing out on a number of other equally viable theories?
    (33:07) What should we think of minimal implementations of each theory's proposed criteria? How do we judge when we have a substantial enough implementation to truly be conscious?
    (34:06) For each of the major theories, what capacities or mechanisms are most important to look for? Are they satisfied by any existing models? Should we expect them to be soon?
    (34:59) Are there any trends among popular theories that might be useful for assessing the presence of the trait, even if we are skeptical that any theory is precisely correct?
    (35:49) Are there plausible theory-independent indicators/proxies for this trait?
    (36:24) Does this trait exist categorically, or does it have a vague/indeterminate extension?
(37:04) 4. Questions of Value
  (38:46) Sample questions on the value of consciousness
    (39:10) Is consciousness intrinsically valuable?
    (40:09) Is it simply wrong to end the lives of conscious beings?
    (41:00) Could non-conscious beings have welfare?
    (41:29) Could non-conscious mental states contribute to welfare?
    (42:04) Are there other states in the ballpark of consciousness that might matter?
    (42:47) What should we think about the value of indeterminate or partial consciousness?
  (43:34) Sample questions on the value of valenced experiences
    (44:43) What determines the value of valenced experiences?
    (45:56) How should we trade off between positive and negative experiences?
    (46:55) How should we distinguish positive from negative valenced experiences?
    (48:23) What is the connection between temporal duration and value?
    (49:45) How do we value experiences that are shared between minds?
  (50:40) Sample questions on the value of desires
    (51:12) Do all desires matter, or only specific kinds? Does it matter whether those desires are conscious? Does it matter if they are reflected upon or otherwise endorsed?
    (51:49) How do we weigh desires?
  (52:35) Other welfare contributors
(53:48) 5. Questions of Domain Epistemology
  (55:17) Sample questions
    (55:24) Is expert opinion biased toward any specific answers?
    (56:43) How confident are experts about our best answers to the relevant questions?
    (57:32) How should we aggregate our uncertainties across different questions into an overall assessment of the moral status of AI systems?
(58:18) Acknowledgments

---

First published: November 11th, 2024

Source: https://forum.effectivealtruism.org/posts/XxvyqQ49ASTAytBM2/the-welfare-of-digital-minds-a-research-agenda

---

Narrated by TYPE III AUDIO.