1. Research Purpose and Significance (literature review, including references)
1.1 Introduction

The goal of this research is to find a suitable function to guide evolutionary and hill-climbing algorithms. Such functions, also known as error functions, loss functions, or objective functions, are a key component of most approaches to machine learning. They are problem-specific functions that measure how close a point in the search space is to a real solution.

This project applies deep reinforcement learning to solving the Rubik's cube. The Rubik's cube has an enormous number of possible configurations, approximately 4.3 × 10^19. Its mathematical description is provided by the Rubik's group, which defines how the layers rotate in the search space. In the solved configuration the cube, as a geometrical object, shows symmetries that are broken when it is driven away from this configuration. When all possible symmetries are regained, the configuration of the Rubik's cube matches the solution of the game. We make use of a deep reinforcement learning algorithm based on the possible configurations.

This project mainly studies the application of deep reinforcement learning to the restoration and combinatorial optimization of the Rubik's cube. The third-order (3x3) Rubik's cube is a classical combinatorial optimization problem. Because it has a single target state, most current restoration algorithms are based on human knowledge, for instance group theory and algebraic abstraction. At present, deep learning and reinforcement learning are developing rapidly, and deep reinforcement learning, produced by combining the two, is even more effective. The match between AlphaGo and Lee Sedol in 2016 left a deep impression on people. Therefore, this project hopes to address the sparse-reward problem that traditional deep reinforcement learning methods such as AlphaGo face on the Rubik's cube. Finally, the algorithm will be implemented on a dual-arm Rubik's cube robot.
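The figure of roughly 4.3 × 10^19 configurations quoted above can be checked with a short calculation. The sketch below uses the standard counting argument for the 3x3 cube (corner and edge arrangements, with the parity constraint that halves the total); the function name count_cube_states is chosen here for illustration only.

```python
from math import factorial


def count_cube_states():
    """Count the reachable configurations of a 3x3 Rubik's cube."""
    # 8 corners can be permuted freely; 7 of them can be twisted freely,
    # and the twist of the last corner is then determined.
    corner_arrangements = factorial(8) * 3**7
    # 12 edges can be permuted freely; 11 of them can be flipped freely,
    # and the flip of the last edge is then determined.
    edge_arrangements = factorial(12) * 2**11
    # Corner and edge permutations must have the same parity,
    # so only half of the combined arrangements are reachable.
    return corner_arrangements * edge_arrangements // 2


total = count_cube_states()
print(total)           # 43252003274489856000
print(f"{total:.1e}")  # 4.3e+19
```

A search space of this size is far too large to enumerate, which is why the project looks for a learned function to guide the search.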
While verifying the algorithm, it will be optimized in two aspects: the success rate and the number of steps. Accumulating these ideas and scientific insights will improve search and optimization algorithms for many different problems.

The Rubik's cube consists of 26 smaller cubes called cubelets. These are classified by their sticker count: center, edge, and corner cubelets have 1, 2, and 3 stickers attached, respectively. There are 54 stickers in total, each uniquely identifiable based on the type of cubelet the sticker is on and the other stickers on the cubelet.

1.2 Research Literature Review

There is a large body of research and literature on this topic. The best way to improve this project is to review as much of it as I can. I expect that the formulations and ideas in this literature will help me complete the project, so a few of these works are summarized here.

1.2.1 Solving Rubik's cube via quantum mechanics and deep reinforcement learning, by Corli, Sebastiano, et al.

This work, from Italy, made a great contribution to the topic.

Figure 1: (a) The different kinds of cubies composing the Cube: edges in light blue, corners in pink, and centrals in grey. The arrows define the arbitrary choice of orientation for the solved state. Two different colors are applied for corners (pink) and edges (blue). (b) The faces of the Cube with their conventional colors blue, red and white. Coordinate axes are introduced to describe the positions of the cubies, together with the choice of orientation in the solved configuration for the corners only. (c) The same coordinate axes describing the positions of the cubies, together with the choice of orientation for the edges in the solved configuration.

1.2.2 Solving the Rubik's cube with stepwise deep learning, by Colin G. Johnson

Johnson wanted to show that the approach used to solve the Rubik's cube can also produce solutions to other difficult problems. His line of thinking offers another approach to deep reinforcement learning algorithms.
His research is summarized below. In Johnson (2018), the LGF (Learned Guidance Function) was applied to the problem of unscrambling the Rubik's Cube. A number of classifiers from the scikit-learn library (scikit-learn, n.d.) were used to implement LGFs, demonstrating that (1) the LGF can learn to recognize the number of turns that have been made to a cube to a decent level of accuracy; and (2) this LGF can then be used to unscramble particular states of the cube in a sensible number of moves. Unscrambling is not one of the non-oracular problems, because the goal state is known, but it has a complex fitness landscape with many local minima, and so is a good test for these kinds of algorithms.

The search space C consists of all possible configurations of colored facelets on the six faces of the cube, each of which has a 3 × 3 set of facelets. The move set M is notated by a list of twelve 90° moves (Singmaster, 1981), which are functions from C to C. The notation m(c) denotes the application of a move m in M to the cube c, returning the new state of the cube.

An earlier paper by the author (Johnson, 2018) applied a number of learning algorithms to the problem of learning LGFs for the Rubik's cube, with random forests proving to be the best approach. That paper did not use any deep learning (Goodfellow et al., 2017) approaches; the later paper extends the work by using deep learning.

In the representation illustrated in Figure 1, the Cube is composed of 26 smaller cubes, called cubies. Six of them show one face (the centrals) and cannot be moved from their relative positions, 12 show two faces (the edges), and 8 show three faces (the corners). A corner cubie cannot take the place of an edge cubie, and vice versa. Any configuration of the Cube can be uniquely identified by two sets of features, namely the positions and orientations of the cubies. The position of a cubie marks how far the cubie is from its place in the solved configuration.
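As a rough illustration of the move set M and the notation m(c) described above, the following Python sketch models each move as a function from a cube state to a new cube state. The sticker-based representation, the assignment of faces to coordinate axes, and the names solved_cube, rotate, make_move, and MOVES are assumptions made for this example, not the representation used in the cited papers. Which of the two turning directions of a face counts as clockwise is also left unspecified; the sketch only guarantees that X' undoes X.

```python
from itertools import product

# Assumed face-to-(axis, layer) assignment; any consistent choice would work.
FACES = {"R": (0, 1), "L": (0, -1), "U": (1, 1),
         "D": (1, -1), "F": (2, 1), "B": (2, -1)}


def solved_cube():
    """Solved state: maps (cubie position, sticker normal) to a color label."""
    state = {}
    for pos in product((-1, 0, 1), repeat=3):
        for axis in range(3):
            if pos[axis] != 0:  # an outward-facing sticker exists on this side
                normal = tuple(pos[axis] if i == axis else 0 for i in range(3))
                state[(pos, normal)] = (axis, pos[axis])  # color = home face
    return state


def rotate(v, axis):
    """Rotate vector v by 90 degrees about the given coordinate axis."""
    x, y, z = v
    if axis == 0:
        return (x, -z, y)
    if axis == 1:
        return (z, y, -x)
    return (-y, x, z)


def make_move(axis, layer, quarter_turns):
    """Build a move m: C -> C that turns one outer layer of the cube."""
    def m(c):
        new_state = {}
        for (pos, normal), color in c.items():
            if pos[axis] == layer:  # only stickers in the turned layer move
                for _ in range(quarter_turns):
                    pos, normal = rotate(pos, axis), rotate(normal, axis)
            new_state[(pos, normal)] = color
        return new_state
    return m


# Twelve 90-degree moves: each face in both directions (X and its inverse X').
MOVES = {}
for name, (axis, layer) in FACES.items():
    MOVES[name] = make_move(axis, layer, 1)
    MOVES[name + "'"] = make_move(axis, layer, 3)

c = solved_cube()
scrambled = MOVES["U"](MOVES["R"](c))           # apply R, then U
restored = MOVES["R'"](MOVES["U'"](scrambled))  # undo in reverse order
print(len(c), len(MOVES), restored == c)        # 54 12 True
```

The state dictionary holds the 54 stickers mentioned in the introduction, and every move returns a new state without modifying its input, which matches the reading of m(c) as a function from C to C.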
The orientation of a cubie stores how, keeping the cubie in its solved position, it has been rotated around a rotation axis suitable to induce permutations. Orientation is graphically represented as an arrow on a face of the cubie, as in Figure 1. Orientations and positions are modified by rotating the layers of the Cube. These transformations are elements of the Rubik's group (R), which is generated by six generators, each corresponding to the rotation of a colored face: U, D, F, B, L, and R, where U and D are the rotations of the upper and lower faces, F and B of the front and back faces, and L and R of the left and right faces, respectively. To describe the orientation and position of each cubie, one may set an orientation for the solved configuration and three Cartesian axes (Figure 1). The geometry of orientation is affected by the number of faces displayed by the cubie. The ways in which a cubie can be oriented are the ways in which the same cubie can be placed in a fixed position by rotating its own faces. Any allowed configuration can be mapped from the solved one, once an algebraic representation for the Rubik's group is given. The six cubies fixed in the middle of the faces are called centrals. They are invariant under all the transformations and show just one face, thus they are not associated with any specific orientation. Edge cubies show two faces, thus two ...

1.3 Implication

We are currently further improving DeepCube by extending it to harder cubes. Autodidactic Iteration (ADI) can be used to train a network to solve a 3x3 cube and other puzzles such as n-dimensional sequential move puzzles and combination puzzles involving other polyhedra. Besides further work with the Rubik's Cube, we are working on extending this method to find approximate solutions to other combinatorial optimization problems such as prediction of protein tertiary structure. Many combinatorial optimization problems can be thought of as sequential decision making problems, in which case we can use reinforcement learning. Bello et al.
train an RNN through policy gradients to solve simple traveling salesman and knapsack problems. We believe that harnessing search will lead to better reinforcement learning approaches for combinatorial optimization. For example, in protein folding, we can think of sequentially placing each amino acid in a 3D lattice at each timestep. If we have a model of the environment, ADI can be used to train a value function which looks at a partially completed state and predicts the future reward when finished. This value function can then be combined with MCTS (Monte Carlo Tree Search) to find approximately optimal conformations.

Léon Bottou defines reasoning as "algebraically manipulating previously acquired knowledge in order to answer a new question" [5]. Many machine learning algorithms do not reason about problems but instead use pattern recognition to perform tasks that are intuitive to humans, such as object recognition. By combining neural networks with symbolic AI, we are able to create algorithms which are able to distill complex environments into knowledge and then reason about that knowledge to solve a problem. DeepCube is able to teach itself how to reason in order to solve a complex environment with only one reward state using pure reinforcement learning.
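The idea of training a value function from a model of the environment and then using it to guide search can be sketched on a very small example. The code below is not DeepCube: it replaces the neural network with a lookup table (so the update is essentially value iteration), replaces MCTS with a greedy one-step lookahead, and uses a hypothetical toy puzzle in which a marker moves around a ring of 12 positions and must reach position 0. The reward of +1 for reaching the goal and -1 otherwise, and all names (N_STATES, step, reward, train_value_table, greedy_solve), are assumptions made for this illustration.

```python
N_STATES = 12  # hypothetical toy puzzle: positions on a ring
GOAL = 0       # the single reward state
ACTIONS = (-1, +1)


def step(state, action):
    """Environment model: move one position around the ring."""
    return (state + action) % N_STATES


def reward(state):
    """+1 for reaching the goal, -1 otherwise (an assumed reward scheme)."""
    return 1.0 if state == GOAL else -1.0


def train_value_table(sweeps=20):
    """Repeatedly back up each state's value from the values of its children."""
    value = [0.0] * N_STATES
    for _ in range(sweeps):
        updated = list(value)
        for s in range(N_STATES):
            if s == GOAL:
                continue  # the goal is terminal; its value stays 0
            updated[s] = max(reward(step(s, a)) + value[step(s, a)]
                             for a in ACTIONS)
        value = updated
    return value


def greedy_solve(start, value, max_steps=100):
    """Follow the best one-step lookahead until the goal is reached."""
    state, path = start, [start]
    while state != GOAL and len(path) <= max_steps:
        state = max((step(state, a) for a in ACTIONS),
                    key=lambda child: reward(child) + value[child])
        path.append(state)
    return path


value = train_value_table()
print(greedy_solve(3, value))  # [3, 2, 1, 0]
```

Even with a single reward state, the backed-up values fall off with distance from the goal, so a simple search that follows them reaches the goal along a shortest path. On the Rubik's cube the table would be far too large, which is where a neural network and a stronger search such as MCTS come in.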
2. Basic Research Content, Problem-Solving Measures, and Plan
2.1 Project Design and Planning

This project consists of two crucial parts: analytical research, and practical experiments based on that research. Most of the ideas will be collected from various research papers, and those papers will be used to lay out the fundamental scope of the work I am going to cover. The workflow of this project draws on many areas, so it needs careful planning and design. The project has two main criteria, one of which is the design of the two-arm robot that will solve the (3x3) Rubik's cube. For this purpose I am going to learn the basic functions of the robot manipulator, some processor techniques, and image processing techniques; others will be included if I need them. For the design work I will use SolidWorks. Some of the work will be done virtually and some on real hardware.

2.1.1 Data Collection and Preparation

Data collection is one of the important tasks for the project and for its scientific purpose. Data will be collected from the working procedure itself and from the research and literature review. During the project it is necessary to take notes on each problem encountered, which will help in later steps. Overall, data will be collected from the areas mentioned earlier.

2.1.2 Hardware Settings

Building the robot manipulator: it is a challenge to build a robot precise enough to solve the Rubik's cube. I have chosen one arm to be a six-axis industrial robot and the other to be a one-axis robot that assists the six-axis robot. Using a six-axis robot will be quite convenient for me. For this I need an actual prototyping area, a good robotics environment where I can work properly. Most issues will be resolved during the experiments. For this project, I will first carry out a virtual robotics experiment, which includes designing the arms in software.
After finishing the design I will move on to the controller, such as an Arduino, STM32, or PLC, through which I can give instructions or commands to the robot. Building the robot also requires attention to sensors. Sensors such as an FPV camera or a regular camera may be used in this project; they serve as input devices for data collection. A robust system also needs a precise workflow, which is my main goal. I am learning Python as a computer vision and image processing tool for the robot manipulator, and I also have some knowledge of MATLAB. I think that will be enough for this project. An OpenCV example is given in the Data Analysis section.

2.1.3 Data Analysis

Data analysis will be carried out in MATLAB, especially for graphs, and in OpenCV for computer vision. Here is a sample that loads an image and processes it with OpenCV:

    import cv2 as cv

    # Load the test image from the working directory and display it.
    img = cv.imread('lena256.bmp')
    if img is None:
        raise FileNotFoundError('lena256.bmp could not be read')
    cv.imshow('lena256', img)

    def rescaleFrame(frame, scale=0.75):
        # frame.shape is (height, width, channels); cv.resize expects (width, height).
        width = int(frame.shape[1] * scale)
        height = int(frame.shape[0] * scale)
        dimensions = (width, height)
        return cv.resize(frame, dimensions, interpolation=cv.INTER_AREA)

    # Show the image scaled to 75% of its original size.
    resized_im = rescaleFrame(img)
    cv.imshow('lena', resized_im)
    cv.waitKey(0)  # wait for a key press before closing the windows
    cv.destroyAllWindows()

This example is only meant to show how the tool works. I admit that I am not a strong planner, so this is only a brief introduction to my plan.

2.1.4 Project Schedule

The work will be carried out according to the following schedule.
Deadline for the opening report: 2022.03.10
Paper submission date: 2022.06.02

Thank you!