Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue with step and reset in TempVMPackingEnv #36

Open
joefarrington opened this issue May 5, 2023 · 1 comment
Open

Issue with step and reset in TempVMPackingEnv #36

joefarrington opened this issue May 5, 2023 · 1 comment

Comments

@joefarrington
Copy link

Hi team,

Awesome work, really like the paper.

I've found a couple of issues with the TempVMPackingEnv environment (VMPacking-v1).

The main one is that the custom logic of the step method was definied as step() instead of _STEP() and so steps in the environment were actually using _STEP() for the parent class. There's is also a small indexing error when demand is being removed from a PM, and I think it would be helpful to also round down any remaining use below tolerance to zero.

There was also an issue in the _RESET() method, where state was being created as an array istead of a dict, and so updates to the state, based on dict keys, did not work.

Proposed fixes below, with a comment followed by my initials (JMF) and an explanation for each changed/added line. Happy to raise a pull request if helpful.

   def _STEP(self, action): # JMF: _STEP instead of step, otherwise _STEP from parent used
        done = False
        pm_state = self.state["state"][:-1]
        demand = self.state["state"][-1, 1:]
        
        if action < 0 or action >= self.n_pms:
            raise ValueError("Invalid action: {}".format(action))
            
        elif any(pm_state[action, 1:] + demand > 1 + self.tol):
            # Demand doesn't fit into PM
            reward = -1000
            done = True
        else:
            if pm_state[action, 0] == 0:
                # Open PM if closed
                pm_state[action, 0] = 1
            pm_state[action, self.load_idx] += demand
            reward = np.sum(pm_state[:, 0] * (pm_state[:,1:].sum(axis=1) - 2))
            self.assignment[self.current_step] = action

        # Remove processes
        if self.current_step in self.durations.values():
            for process in self.durations.keys():
                # Remove process from PM
                if self.durations[process] == self.current_step:
                    pm = self.assignment[process] # Find PM where process was assigned
                    pm_state[pm, self.load_idx] -= self.demand[process][1:] # JMF: Index to exclude first element of demand array
                    pm_state[pm, self.load_idx] = np.where(pm_state[pm, self.load_idx]<self.tol, 0., pm_state[pm, self.load_idx]) # JMF: Deal with rounding
                    # Shut down PM's if state is 0
                    if pm_state[pm, self.load_idx].sum() == 0:
                        pm_state[pm, 0] = 0
            
        self.current_step += 1
        if self.current_step >= self.step_limit:
            done = True
        self.update_state(pm_state)
        return self.state, reward, done, {}
    def _RESET(self):
        self.current_step = 0
        self.assignment = {}
        self.demand = self.generate_demand()
        self.durations = generate_durations(self.demand)
        self.state = {
            "action_mask": np.ones(self.n_pms, dtype=np.uint8),
            "avail_actions": np.ones(self.n_pms, dtype=np.uint8),
            "state": np.vstack([
                np.zeros((self.n_pms, 3)),
                self.demand[self.current_step]],
                dtype=np.float32)
        } # JMF: Dict instead on NumPy array, otherwise update_state doesn't work
        return self.state
@preethiSH-cmrit
Copy link

Could you please let me know how to use VMPacking-v1 or VMPacking-v0. I'm not sure how to set up the env_config for this. Any help would be greatly appreciated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
2 participants